This makes the Y position consistent with other instructions.
This should have been NFC, but while refactoring the multiclass I noticed that VROUNDPD memory forms were using the register itinerary.
llvm-svn: 328254
We already know all the of instructions we're processing in the instruction loop belong to no class or all to the same class. So we only have to worry about remapping one class. So hoist it all out and remove the SmallPtrSet that tracked which class we'd already remapped.
I had to introduce new instruction loop inside this code to print an error message, but that only occurs on the error path.
llvm-svn: 328142
We already have an OldSCIdx variable in the outer loop here. And we already did the map lookup in the loop that populated ClassInstrs. And the outer OldSCIdx got it from ClassInstrs.
llvm-svn: 328139
Summary:
This code previously had a SmallVector of std::pairs containing an unsigned and another SmallVector. The outer vector was using the unsigned effectively as a key to decide which SmallVector to add into. So each time something new needed to be added the out vector needed to be scanned. If it wasn't found a new entry needed to be added to be added. This sounds very much like a map, but the next loop iterates over the outer vector to get a deterministic order.
We can simplify this code greatly if use SmallMapVector instead. This uses more stack space since we now have a vector and a map, but the searching and creating new entries all happens behind the scenes. It should also make the search more efficient though usually there are only a few entries so that doesn't matter much.
We could probably get determinism by just using std::map which would iterate over the unsigned key, but that would generate different output from what we get with the current implementation.
Reviewers: RKSimon, dblaikie
Reviewed By: dblaikie
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D44711
llvm-svn: 328070
Both vectors contain unsigned so we can just use append to do the copying. Not only is this shorter, but it should be able to predict the final size and only grow the vector once if needed.
llvm-svn: 328033
Registers E[A-D]X, E[SD]I, E[BS]P, and EIP have 16-bit subregisters
that cover the low halves of these registers. This change adds artificial
subregisters for the high halves in order to differentiate (in terms of
register units) between the 32- and the low 16-bit registers.
This patch contains parts that aim to preserve the calculated register
pressure. This is in order to preserve the current codegen (minimize the
impact of this patch). The approach of having artificial subregisters
could be used to fix PR23423, but the pressure calculation would need
to be changed.
Differential Revision: https://reviews.llvm.org/D43353
llvm-svn: 328016
This is similar to the check later when we remap some of the instructions from one class to a new one. But if we reuse the class we don't get to do that check.
So many CPUs have violations of this check that I had to add a flag to the SchedMachineModel to allow it to be disabled. Hopefully we can get those cleaned up quickly and remove this flag.
A lot of the violations are due to overlapping regular expressions, but that's not the only kind of issue it found.
llvm-svn: 327808
X86 Supports Indirect Branch Tracking (IBT) as part of Control-Flow Enforcement Technology (CET).
IBT instruments ENDBR instructions used to specify valid targets of indirect call / jmp.
The `nocf_check` attribute has two roles in the context of X86 IBT technology:
1. Appertains to a function - do not add ENDBR instruction at the beginning of the function.
2. Appertains to a function pointer - do not track the target function of this pointer by adding nocf_check prefix to the indirect-call instruction.
This patch implements `nocf_check` context for Indirect Branch Tracking.
It also auto generates `nocf_check` prefixes before indirect branchs to jump tables that are guarded by range checks.
Differential Revision: https://reviews.llvm.org/D41879
llvm-svn: 327767
Remove the special casing for MRM_F8 by using HANDLE_OPTIONAL.
This should be NFC as the forms that were missing aren't used by any instructions today. They exist in the enum so that we didn't have to put them in one at a time when instructions are added. But looks like we failed here.
llvm-svn: 327298
With this patch, the tablegen 'SubtargetEmitter' always generates processor
resource names.
The impact of this patch on the code size of other llvm tools is small. I have
observed an average increase of 0.03% in code size when doing a release build of
LLVM (on windows, using MSVC) with all the default backends.
This change is done in preparation for the upcoming llvm-mca patch.
llvm-svn: 326993
The former simply makes more sense: we want to access the data here in
the backend, not information about the type.
More importantly, removing users of RecordRecTy::getRecord() allows us
more freedom to refactor the frontend.
Change-Id: Iee8905fd22cdb9b11c42ca03246c03d8fe4dd77f
llvm-svn: 326699
Summary:
Add a target option AllowRegisterRenaming that is used to opt in to
post-register-allocation renaming of registers. This is set to 0 by
default, which causes the hasExtraSrcRegAllocReq/hasExtraDstRegAllocReq
fields of all opcodes to be set to 1, causing
MachineOperand::isRenamable to always return false.
Set the AllowRegisterRenaming flag to 1 for all in-tree targets that
have lit tests that were effected by enabling COPY forwarding in
MachineCopyPropagation (AArch64, AMDGPU, ARM, Hexagon, Mips, PowerPC,
RISCV, Sparc, SystemZ and X86).
Add some more comments describing the semantics of the
MachineOperand::isRenamable function and how it is set and maintained.
Change isRenamable to check the operand's opcode
hasExtraSrcRegAllocReq/hasExtraDstRegAllocReq bit directly instead of
relying on it being consistently reflected in the IsRenamable bit
setting.
Clear the IsRenamable bit when changing an operand's register value.
Remove target code that was clearing the IsRenamable bit when changing
registers/opcodes now that this is done conservatively by default.
Change setting of hasExtraSrcRegAllocReq in AMDGPU target to be done in
one place covering all opcodes that have constant pipe read limit
restrictions.
Reviewers: qcolombet, MatzeB
Subscribers: aemerson, arsenm, jyknight, mcrosier, sdardis, nhaehnle, javed.absar, tpr, arichardson, kristof.beyls, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, niosHD, escha, nemanjai, llvm-commits
Differential Revision: https://reviews.llvm.org/D43042
llvm-svn: 325931
This patch changes GlobalISelEmitter to rank patterns similar to how the
DAG does it (ie it computes a score for a pattern and adds the added
complexity to it).
This is so that the decision tree for GISelSelector remains compatible
with that of SelectionDAG.
https://reviews.llvm.org/D43270
llvm-svn: 325401
Summary:
This patch makes the decoder understand old AMD 3DNow!
instructions that have never been properly supported in the X86
disassembler, despite being supported in other subsystems. Hopefully
this should make the X86 decoder more complete with respect to binaries
containing legacy code.
Reviewers: craig.topper
Reviewed By: craig.topper
Subscribers: llvm-commits, maksfb, bruno
Differential Revision: https://reviews.llvm.org/D43311
llvm-svn: 325295
Summary:
Right now using a ProcResource automatically counts as usage of all
super ProcResGroups. All this is done during codegen, so there is no
way for schedulers to get this information at runtime.
This adds the information of which individual ProcRes units are
contained in a ProcResGroup in MCProcResourceDesc.
Reviewers: gchatelet
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D43023
llvm-svn: 324582
Summary:
Right now only the ProcResourceUnits that are directly referenced by
instructions are emitted. This change emits all of them, so that
analysis passes can use the information.
This has no functional impact. It typically adds a few entries (e.g. 4
for X86/haswell) to the generated ProcRes table.
Reviewers: gchatelet
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42903
llvm-svn: 324228
Summary:
This is a bit of a reimplementation the work done in
https://reviews.llvm.org/D41446, since that patch only really works for
tied operands of instructions, not aliases.
Instead of checking the constraints based on the matched instruction's opcode,
this patch uses the match-info's convert function to check the operand
constraints for that specific instruction/alias.
This is based on the matched operands for the instruction, not the
resulting opcode of the MCInst.
This patch adds the following enum/table to the *GenAsmMatcher.inc file:
enum {
Tie0_1_1,
Tie0_1_2,
Tie0_1_5,
...
};
const char TiedAsmOperandTable[][3] = {
/* Tie0_1_1 */ { 0, 1, 1 },
/* Tie0_1_2 */ { 0, 1, 2 },
/* Tie0_1_5 */ { 0, 1, 5 },
...
};
And it is referenced directly in the ConversionTable, like this:
static const uint8_t ConversionTable[CVT_NUM_SIGNATURES][13] = {
...
{ CVT_95_addRegOperands, 1,
CVT_95_addRegOperands, 2,
CVT_Tied, Tie0_1_5,
CVT_95_addRegOperands, 6, CVT_Done },
...
The Tie0_1_5 (and corresponding table) encodes that:
* Result operand 0 is the operand to copy (which is e.g. done when
building up the operands to the MCInst in convertToMCInst())
* Asm operands 1 and 5 should be the same operands (which is checked
in checkAsmTiedOperandConstraints()).
Reviewers: olista01, rengolin, fhahn, craig.topper, echristo, apazos, dsanders
Reviewed By: olista01
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42293
llvm-svn: 324196
Summary:
Apparently, we missed on constraining register classes of VReg-operands of all the instructions
built from a destination pattern but the root (top-level) one. The issue exposed itself
while selecting G_FPTOSI for armv7: the corresponding pattern generates VTOSIZS wrapped
into COPY_TO_REGCLASS, so top-level COPY_TO_REGCLASS gets properly constrained,
while nested VTOSIZS (or rather its destination virtual register to be exact) does not.
Fixing this by issuing GIR_ConstrainSelectedInstOperands for every nested GIR_BuildMI.
https://bugs.llvm.org/show_bug.cgi?id=35965
rdar://problem/36886530
Patch by Roman Tereshin
Reviewers: dsanders, qcolombet, rovka, bogner, aditya_nandakumar, volkan
Reviewed By: dsanders, qcolombet, rovka
Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D42565
llvm-svn: 323692
Collected statistics for the number of patterns emitted can be
incorrect because rules can be grouped if OptimizeMatchTable
is enabled. Increase the counter in RuleMatcher::emit(...)
to avoid that.
llvm-svn: 323391
This is a bit of a hack, but removes a cycle that broke modular builds
of LLVM. Of course the cycle is still there in form of a dependency
on the .def file.
llvm-svn: 323383
llvm::Regex is still the slowest regex engine on earth, running it over
all instructions on X86 takes a while. Extract a prefix and use a binary
search to reduce the search space before we resort to regex matching.
There are a couple of caveats here:
- The generic opcodes are outside of the sorted enum. They're handled in an extra loop.
- If there's a top-level bar we can't use the prefix trick.
- We bail on top-level ?. This could be handled, but it's rare.
This brings the time to generate X86GenInstrInfo.inc from 21s to 4.7s on
my machine.
llvm-svn: 323277
It appears that we haven't been prioritizing rules that contain nested
instructions properly. InstructionOperandMatcher didn't override
isHigherPriorityThan so it never compared the instructions/operands/predicates
inside nested instructions.
Fixes PR35926. Thanks to Diana Picus for the bug report.
llvm-svn: 322754
Summary:
This patch adds CustomRenderer which renders the matched
operands to the specified instruction.
Targets can enable the matching of SDNodeXForm by adding
a definition that inherits from GICustomOperandRenderer and
GISDNodeXFormEquiv as follows.
def gi_imm8 : GICustomOperandRenderer<"renderImm8”>,
GISDNodeXFormEquiv<imm8_xform>;
Custom renderer functions should be of the form:
void render(MachineInstrBuilder &MIB, const MachineInstr &I);
Reviewers: dsanders, ab, rovka
Reviewed By: dsanders
Subscribers: kristof.beyls, javed.absar, llvm-commits, mgrang, qcolombet
Differential Revision: https://reviews.llvm.org/D42012
llvm-svn: 322582
Prior to this we had a separate instruction and register class that excluded eax to prevent matching the instruction that would encode with 0x90.
This patch changes this to just use an InstAlias to force xchgl %eax, %eax to use XCHG32rr instruction in 64-bit mode. This gets rid of the separate instruction and register class.
llvm-svn: 322532
Summary:
This extends TableGen's AsmMatcherEmitter with code that generates
a table with tied-operand constraints. The constraints are checked
when parsing the instruction. If an operand is not equal to its tied operand,
the assembler will give an error.
Patch [2/3] in a series to add operand constraint checks for SVE's predicated ADD/SUB.
Reviewers: olista01, rengolin, mcrosier, fhahn, craig.topper, evandro, echristo
Reviewed By: fhahn
Subscribers: javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D41446
llvm-svn: 322166
This patch improves diagnostic for case when mapped instruction
does not contain a field listed under RowFields.
Differential Revision: https://reviews.llvm.org/D41778
llvm-svn: 322004
This change deals with intrinsics with multiple outputs, for example load
instrinsic with address updated.
DAG selection for Instrinsics could be done either through source code or
tablegen. Handling all intrinsics in source code would introduce a huge chunk
of repetitive code if we have a large number of intrinsic that return multiple
values (see NVPTX as an example). While intrinsic class in tablegen supports
multiple outputs, tablegen only supports Intrinsics with zero or one output on
TreePattern. This appears to be a simple bug in tablegen that is fixed by this
change.
For Intrinsics defined as:
def int_xxx_load_addr_updated: Intrinsic<[llvm_i32_ty, llvm_ptr_ty], [llvm_ptr_ty, llvm_i32_ty], []>;
Instruction will be defined as:
def L32_X: Inst<(outs reg:$d1, reg:$d2), (ins reg:$s1, reg:$s2), "ld32_x $d1, $d2, $s2", [(set i32:$d1, i32:$d2, (int_xxx_load_addr_updated i32:$s1, i32:$s2))]>;
Patch by Wenbo Sun, thanks!
Differential Revision: https://reviews.llvm.org/D32888
llvm-svn: 321704
Allows preserving MachineMemOperands on intrinsics
through selection. For reasons I don't understand, this
is a static property of the pattern and the selector
deliberately goes out of its way to drop if not present.
Intrinsics already inherit from SDPatternOperator allowing
them to be used directly in instruction patterns. SDPatternOperator
has a list of SDNodeProperty, but you currently can't set them on
the intrinsic. Without SDNPMemOperand, when the node is selected
any memory operands are always dropped. Allowing setting this
on the intrinsics avoids needing to introduce another equivalent
target node just to have SDNPMemOperand set.
llvm-svn: 321212
NFC for currently supported targets. This resolves a problem encountered by
targets such as RISCV that reference `Subtarget` in ImmLeaf predicates.
llvm-svn: 321176
This patch resubmits the SVE ZIP1/ZIP2 patch series consisting of
of r320992, r320986, r320973, and r320970 by reverting
https://reviews.llvm.org/rL321024.
The issue that caused r321024 has been addressed in https://reviews.llvm.org/rL321158,
so this patch-series should be safe to resubmit.
llvm-svn: 321163
Between the creation of the last InstructionMatcher and the first
emission of the related Rule, we need to clear the internal map of IDs.
We used to do that right after the creation of the main
InstructionMatcher when building the rule and although that worked, this
is fragile because if for some reason some later code decides to create
more InstructionMatcher before the final call to emit, then the IDs
would be completely messed up.
Move that to the beginning of "emit" so that the IDs are guarantee to be
consistent.
NFC.
llvm-svn: 321053
Move InsnVarID and OpIdx at the beginning of the list of arguments
for all the constructors of the OperandMatcher subclasses.
This matches what we do for the InstructionMatcher.
NFC.
llvm-svn: 321031
In theory, reapplying optimizeRules on each group matchers should give
us a second nesting level on the matching table. In practice, we need
more work to make that happen because all the predicates are actually
not directly available through the predicate matchers list.
NFC.
llvm-svn: 321025
This reverts changes r320992, r320986, r320973, and r320970.
r320970 by itself breaks the test case, and the rest depend on it.
Test case will land soon.
llvm-svn: 321024
*** Context ***
Prior to this patchw, the table generated for matching instruction was
straight forward but highly inefficient.
Basically, each pattern generates its own set of self contained checks
and actions.
E.g., TableGen generated:
// First pattern
CheckNumOperand 3
CheckOpcode G_ADD
...
Build ADDrr
// Second pattern
CheckNumOperand 3
CheckOpcode G_ADD
...
Build ADDri
// Third pattern
CheckNumOperand 3
CheckOpcode G_SUB
...
Build SUBrr
*** Problem ***
Because of that generation, a *lot* of check were redundant between each
pattern and were checked every single time until we reach the pattern
that matches.
E.g., Taking the previous table, let say we are matching a G_SUB, that
means we were going to check all the rules for G_ADD before looking at
the G_SUB rule. In particular we are going to do:
check 3 operands; PASS
check G_ADD; FAIL
; Next rule
check 3 operands; PASS (but we already knew that!)
check G_ADD; FAIL (well it is still not true)
; Next rule
check 3 operands; PASS (really!!)
check G_SUB; PASS (at last :P)
*** Proposed Solution ***
This patch introduces a concept of group of rules (GroupMatcher) that
share some predicates and only get checked once for the whole group.
This patch only creates groups with one nesting level. Conceptually
there is nothing preventing us for having deeper nest level. However,
the current implementation is not smart enough to share the recording
(aka capturing) of values. That limits its ability to do more sharing.
For the given example the current patch will generate:
// First group
CheckOpcode G_ADD
// First pattern
CheckNumOperand 3
...
Build ADDrr
// Second pattern
CheckNumOperand 3
...
Build ADDri
// Second group
CheckOpcode G_SUB
// Third pattern
CheckNumOperand 3
...
Build SUBrr
But if we allowed several nesting level, it could create a sub group
for the checknumoperand 3.
(We would need to call optimizeRules on the rules within a group.)
*** Result ***
With only one level of nesting, the instruction selection pass is up
to 4x faster. For instance, one instruction now takes 500 checks,
instead of 24k! With more nesting we could get in the tens I believe.
Differential Revision: https://reviews.llvm.org/D39034
rdar://problem/34670699
llvm-svn: 321017
Summary: Patch [4/4] in a series to add parsing of predicates and properly parse SVE ZIP1/ZIP2 instructions. This patch further improves diagnostic messages for when the SVE feature is not specified.
Reviewers: rengolin, fhahn, olista01, echristo, efriedma
Reviewed By: fhahn
Subscribers: sdardis, aemerson, javed.absar, tschuett, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D40363
llvm-svn: 320992
Summary:
When emitting a diagnostic for an invalid operand, a specific diagnostic
should only be reported when the instruction being matched is actually
enabled by the feature flags.
Patch [3/4] in a series to add parsing of predicates and properly parse SVE
ZIP1/ZIP2 instructions. This patch fixes bogus diagnostic messages for when
the SVE feature is not specified.
Reviewers: rengolin, craig.topper, olista01, sdardis, stoklund
Reviewed By: olista01, sdardis
Subscribers: fhahn, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D40362
llvm-svn: 320986
Prior to this patch, a predicate wouldn't make sense outside of its
rule. Indeed, it was only during emitting a rule that a predicate would
be made aware of the IDs of the data it is checking. Because of that,
predicates could not be moved around or compared between each other.
NFC.
llvm-svn: 320887
Summary:
The generated diagnostic by the AsmMatcher isn't always applicable to the AsmOperand.
This is because the code will only update the diagnostic if it is more
specific than the previous diagnostic. However, when having validated
operands and 'moved on' to a next operand (for some instruction/alias for
which all previous operands are valid), if the diagnostic is InvalidOperand,
than that should be set as the diagnostic, not the more specific message
about a previous operand for some other instruction/alias candidate.
(Re-committed with an extra whitespace in SVEInstrFormats.td to trigger rebuild
of AArch64GenAsmMatcher.inc, since the llvm-clang-x86_64-expensive-checks-win
builder does not seem to rebuild AArch64GenAsmMatcher.inc with the
newly built TableGen due to a missing dependency somewhere (see:
http://lists.llvm.org/pipermail/llvm-dev/2017-December/119555.html))
Reviewers: craig.topper, olista01, rengolin, stoklund
Reviewed By: olista01
Subscribers: javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D40011
llvm-svn: 320711
Most of the targets don't need the scheduler class enum.
I have an X86 scheduler model change that causes some names in the enum to become about 18000 characters long. This is because using instregex in scheduler models causes the scheduler class to get named with every instruction that matches the regex concatenated together. MSVC has a limit of 4096 characters for an identifier name. Rather than trying to come up with way to reduce the name length, I'm just going to sidestep the problem by not including the enum in X86.
llvm-svn: 320552
A number of architectures re-use the same register names (e.g. for both 32-bit
FPRs and 64-bit FPRs). They are currently unable to use the tablegen'erated
MatchRegisterName and MatchRegisterAltName, as tablegen (when built with
asserts enabled) will fail.
When the AllowDuplicateRegisterNames in AsmParser is set, duplicated register
names will be tolerated. A backend can then coerce registers to the desired
register class by (for instance) implementing validateTargetOperandClass.
At least the in-tree Sparc backend could benefit from this, as does RISC-V
(single and double precision floating point registers).
Differential Revision: https://reviews.llvm.org/D39845
llvm-svn: 320018
This patch splits atomics out of the generic G_LOAD/G_STORE and into their own
G_ATOMIC_LOAD/G_ATOMIC_STORE. This is a pragmatic decision rather than a
necessary one. Atomic load/store has little in implementation in common with
non-atomic load/store. They tend to be handled very differently throughout the
backend. It also has the nice side-effect of slightly improving the common-case
performance at ISel since there's no longer a need for an atomicity check in the
matcher table.
All targets have been updated to remove the atomic load/store check from the
G_LOAD/G_STORE path. AArch64 has also been updated to mark
G_ATOMIC_LOAD/G_ATOMIC_STORE legal.
There is one issue with this patch though which also affects the extending loads
and truncating stores. The rules only match when an appropriate G_ANYEXT is
present in the MIR. For example,
(G_ATOMIC_STORE (G_TRUNC:s16 (G_ANYEXT:s32 (G_ATOMIC_LOAD:s16 X))))
will match but:
(G_ATOMIC_STORE (G_ATOMIC_LOAD:s16 X))
will not. This shouldn't be a problem at the moment, but as we get better at
eliminating extends/truncates we'll likely start failing to match in some
cases. The current plan is to fix this in a patch that changes the
representation of extending-load/truncating-store to allow the MMO to describe
a different type to the operation.
llvm-svn: 319691
This is causing a failure in the llvm-clang-x86_64-expensive-checks-win
buildbot, and I can't reproduce it locally, so reverting until I can work out
what is wrong.
llvm-svn: 319654
This adds a "invalid operands for instruction" diagnostic for
instructions where there is an instruction encoding with the correct
mnemonic and which is available for this target, but where multiple
operands do not match those which were provided. This makes it clear
that there is some combination of operands that is valid for the current
target, which the default diagnostic of "invalid instruction" does not.
Since this is a very general error, we only emit it if we don't have a
more specific error.
Differential revision: https://reviews.llvm.org/D36747
llvm-svn: 319649
GIM_CheckNonAtomic has been replaced by GIM_CheckAtomicOrdering to allow it to support a wider
range of orderings. This has then been used to import patterns using nodes such
as atomic_cmp_swap, atomic_swap, and atomic_load_*.
llvm-svn: 319232
RecordKeeper::getDef() is a hot place, it shows up in profiling
and it creates std::string instance for each search in RecordMap
though RecordKeeper::RecordMap can use StringRef as a key
instead to avoid that. Patch do that change.
Differential revision: https://reviews.llvm.org/D40170
llvm-svn: 318822
When searching for a resource unit, use the reference location instead of
the definition location in case of an error.
Differential revision: https://reviews.llvm.org/D40263
llvm-svn: 318803
- We can still emit this error if the actual instruction has two or more
operands missing compared to the expected one.
- We should only emit this error once per instruction.
Differential revision: https://reviews.llvm.org/D36746
llvm-svn: 318770
This is NFC, as the matcher would continue looping up to the maximum
number of operands with no effect, but this should improve performance a
bit, and makes the debug trace clearer.
Differential revision: https://reviews.llvm.org/D36744
llvm-svn: 318769
Summary:
The generated diagnostic by the AsmMatcher isn't always applicable to the AsmOperand.
This is because the code will only update the diagnostic if it is more specific than the previous diagnostic. However, when having validated operands and 'moved on' to a next operand (for some instruction/alias for which all previous operands are valid), if the diagnostic is InvalidOperand, than that should be set as the diagnostic, not the more specific message about a previous operand for some other instruction/alias candidate.
Reviewers: craig.topper, olista01, rengolin, stoklund
Reviewed By: olista01
Subscribers: javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D40011
llvm-svn: 318759
Summary:
This patch fixes an issue so that the right alias is printed when the instruction has tied operands. It checks the number of operands in the resulting instruction as opposed to the alias, and then skips over tied operands that should not be printed in the alias.
This allows to generate the preferred assembly syntax for the AArch64 'ins' instruction, which should always be displayed as 'mov' according to the ARM Architecture Reference Manual. Several unit tests have changed as a result, but only to reflect the preferred disassembly. Some other InstAlias patterns (movk/bic/orr) needed a slight adjustment to stop them becoming the default and breaking other unit tests.
Please note that the patch is mostly the same as https://reviews.llvm.org/D29219 which was reverted because of an issue found when running TableGen with the Address Sanitizer. That issue has been addressed in this iteration of the patch.
Reviewers: rengolin, stoklund, huntergr, SjoerdMeijer, rovka
Reviewed By: rengolin, SjoerdMeijer
Subscribers: fhahn, aemerson, javed.absar, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D40030
llvm-svn: 318650
ptypeN is functionally the same as typeN except that it informs the
SelectionDAG importer that an operand should be treated as a pointer even
if it was written as iN. This is important for patterns that use iN instead
of iPTR to represent pointers. E.g.:
(set GPR64:$dst, (load GPR64:$addr))
Previously, this was handled as a hardcoded special case for the appropriate
operands to G_LOAD and G_STORE.
llvm-svn: 318574
All these headers already depend on CodeGen headers so moving them into
CodeGen fixes the layering (since CodeGen depends on Target, not the
other way around).
llvm-svn: 318490
Summary:
This patch adds a LLVM_ENABLE_GISEL_COV which, like LLVM_ENABLE_DAGISEL_COV,
causes TableGen to instrument the generated table to collect rule coverage
information. However, LLVM_ENABLE_GISEL_COV goes a bit further than
LLVM_ENABLE_DAGISEL_COV. The information is written to files
(${CMAKE_BINARY_DIR}/gisel-coverage-* by default). These files can then be
concatenated into ${LLVM_GISEL_COV_PREFIX}-all after which TableGen will
read this information and use it to emit warnings about untested rules.
This technique could also be used by SelectionDAG and can be further
extended to detect hot rules and give them priority over colder rules.
Usage:
* Enable LLVM_ENABLE_GISEL_COV in CMake
* Build the compiler and run some tests
* cat gisel-coverage-[0-9]* > gisel-coverage-all
* Delete lib/Target/*/*GenGlobalISel.inc*
* Build the compiler
Known issues:
* ${LLVM_GISEL_COV_PREFIX}-all must be generated as a manual
step due to a lack of a portable 'cat' command. It should be the
concatenation of all ${LLVM_GISEL_COV_PREFIX}-[0-9]* files.
* There's no mechanism to discard coverage information when the ruleset
changes
Depends on D39742
Reviewers: ab, qcolombet, t.p.northover, aditya_nandakumar, rovka
Reviewed By: rovka
Subscribers: vsk, arsenm, nhaehnle, mgorny, kristof.beyls, javed.absar, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D39747
llvm-svn: 318356
This is a tablegen backend to generate documentation for the opcodes that exist
for each target. For each opcode, it lists the assembly string, the names and
types of all operands, and the flags and predicates that apply to the opcode.
Differential revision: https://reviews.llvm.org/D31025
llvm-svn: 318155
Similar to r315841, GlobalISel and SelectionDAG require different code for the
common atomic predicates due to differences in the representation.
Even without that, differences in the IR (SDNode vs MachineInstr) require
differences in the C++ predicate.
This patch moves the implementation of the common atomic predicates related to
ordering into tablegen so that it can handle these differences.
It's NFC for SelectionDAG since it emits equivalent code and it's NFC for
GlobalISel since the rules involving the relevant predicates are still
rejected by the importer.
llvm-svn: 318102
Similar to r315841, GlobalISel and SelectionDAG require different code for the
common atomic predicates due to differences in the representation.
Even without that, differences in the IR (SDNode vs MachineInstr) require
differences in the C++ predicate.
This patch moves the implementation of the common atomic predicates related to
memory type into tablegen so that it can handle these differences.
It's NFC for SelectionDAG since it emits equivalent code and it's NFC for
GlobalISel since the rules involving the relevant predicates are still
rejected by the importer.
llvm-svn: 318095
Some alias instructions are printed with an extra space after the tab
character. Fix this by skipping that space when the tab character is printed
so that the instructions are aligned with the rest of the code.
Patch by Milos Stojanovic.
Differential Revision: https://reviews.llvm.org/D35946
llvm-svn: 318059
Allow a pattern rewriter to be installed in CodeGenDAGPatterns and use it to
correct situations where SelectionDAG and GlobalISel disagree on
representation. For example, it would rewrite:
(sextload:i32 $ptr)<<unindexedload>><<sextload>><<sextloadi16>
to:
(sext:i32 (load:i16 $ptr)<<unindexedload>>)
I'd have preferred to replace the fragments and have the expansion happen
naturally as part of PatFrag expansion but the type inferencing system can't
cope with loads of types narrower than those mentioned in register classes.
This is because the SDTCisInt's on the sext constrain both the result and
operand to the 'legal' integer types (where legal is defined as 'a register
class can contain the type') which immediately rules the narrower types out.
Several targets (those with only one legal integer type) would then go on to
crash on the SDTCisOpSmallerThanOp<> when it removes all the possible types
for the result of the extend.
Also, improve isObviouslySafeToFold() slightly to automatically return true for
neighbouring instructions. There can't be any re-ordering problems if
re-ordering isn't happenning. We'll need to improve it further to handle
sign/zero-extending loads when the extend and load aren't immediate neighbours
though.
llvm-svn: 317971
This patch adds the ability to include the member function declarations
in the instruction selector class separately from the member bodies.
Defining GET_DAGISEL_DECL macro to any value will only include the member
declarations. To include bodies, define GET_DAGISEL_BODY macro to be the
selector class name. Example:
class FooDAGToDAGISel : public SelectionDAGISel {
// Pull in declarations only.
#define GET_DAGISEL_DECL
#include "FooISelDAGToDAG.inc"
};
// Include the function bodies (with names qualified with the provided
// class name).
#define GET_DAGISEL_BODY FooDAGToDAGISel
#include "FooISelDAGToDAG.inc"
When neither of the two macros are defined, the function bodies are emitted
inline (in the same way as before this patch).
Differential Revision: https://reviews.llvm.org/D39596
llvm-svn: 317903
Patch [1/5] in a series to add assembler/disassembler support for AArch64 SVE
unpredicated ADD/SUB instructions.
Patch by Sander De Smalen.
Reviewed by: rengolin
Differential Revision: https://reviews.llvm.org/D39087
llvm-svn: 317564
The GlobalISel TableGen backend didn't check for predicates on the
source children. This caused it to generate code for ARM patterns such
as SMLABB or similar, but without properly checking for the sext_16_node
part of the operands. This in turn meant that we would select SMLABB
instead of MLA for simple sequences such as s32 + s32 * s32, which is
wrong (we want a MLA on the full operands, not just their bottom 16
bits).
This patch forces TableGen to skip patterns with predicates on the src
children, so it doesn't generate code for SMLABB and other similar ARM
instructions at all anymore. AArch64 and X86 are not affected.
Differential Revision: https://reviews.llvm.org/D39554
llvm-svn: 317313
This will enable us to prefer VALIGND/Q during shuffle lowering in order to get the extended register encoding space when BWI isn't available. But if we end up not using the extended registers we can switch VPALIGNR for the shorter VEX encoding.
Differential Revision: https://reviews.llvm.org/D39401
llvm-svn: 317122
The importer will now accept nested instructions in the result pattern such as
(ADDWrr $a, (SUBWrr $b, $c)). This is only valid when the nested instruction
def's a single vreg and the parent instruction consumes a single vreg where a
nested instruction is specified. The importer will automatically create a vreg
to connect the two using the type information from the pattern. This vreg will
be constrained to the register classes given in the instruction definitions*.
* REG_SEQUENCE is explicitly rejected because of this. The definition doesn't
constrain to a register class and it therefore needs special handling.
llvm-svn: 317117
The next commit will add support for multi-instruction emission so we need to
start allocating instruction ID's instead of hard-coding them to 0.
llvm-svn: 317057
Multi-instruction emission needs to ensure the the instructions are generated
a depth-first fashion. For example:
(ADDWrr (SUBWrr a, b), c)
needs to emit the SUBWrr before the ADDWrr. However, our walk over
TreePatternNode's is highly context sensitive which makes it difficult to append
BuildMIActions in the order we want. To fix this, we now keep track of the
insertion point as we add actions. This will allow multi-insn emission to insert
BuildMI's in the correct place.
The previous commit failed on the Ubuntu bots using GCC 4.8. These bots lack the
const_iterator forms of insert() and emplace() that were added in C++11. As a
result I've switched the const_iterators to iterators.
llvm-svn: 317049
The same bots fail but I believe I know what the issue is now. These bots are
missing the const_iterator versions of insert/emplace/etc. that were introduced
in C++11.
llvm-svn: 317042
Multi-instruction emission needs to ensure the the instructions are generated
a depth-first fashion. For example:
(ADDWrr (SUBWrr a, b), c)
needs to emit the SUBWrr before the ADDWrr. However, our walk over
TreePatternNode's is highly context sensitive which makes it difficult to append
BuildMIActions in the order we want. To fix this, we now keep track of the
insertion point as we add actions. This will allow multi-insn emission to insert
BuildMI's in the correct place.
The previous commit failed on the Ubuntu bots using GCC 4.8. These bots didn't
like a call to emplace(). I've replaced it with insert() to see if it's a quirk
of the C++11 support.
llvm-svn: 317040
Multi-instruction emission needs to ensure the the instructions are generated
a depth-first fashion. For example:
(ADDWrr (SUBWrr a, b), c)
needs to emit the SUBWrr before the ADDWrr. However, our walk over
TreePatternNode's is highly context sensitive which makes it difficult to append
BuildMIActions in the order we want. To fix this, we now keep track of the
insertion point as we add actions. This will allow multi-insn emission to insert
BuildMI's in the correct place.
llvm-svn: 317029
Multi-instruction emission will require that we have separate handling for
the defs between the implicitly created temporaries and the rule outputs.
The former require new temporary vregs while the latter should copy existing
operands. Factor out the implicit def/use renderers to minimize the code
duplication when we implement that.
llvm-svn: 317025
Prepare for multiple instruction emission by allowing BuildMIAction to
search for a suitable matcher that will support mutation.
This patch deliberately neglects to add matchers aside from the root to
preserve NFC. That said, it should be noted that until we support mutations
other than just the opcode the chances of finding a non-root instruction
for which canMutate() is true, is essentially zero. Furthermore in the
presence of multi-instruction emission the chances of finding any
instruction for which canMutate() is true is also zero. Nevertheless, we
can't continue to require that all BuildMIAction's consider the root of the match
to be recyclable due to the risk of recycling it twice in the same rule.
llvm-svn: 317022
When multi-instruction emission is supported, it will no longer be guaranteed
that every BuildMIAction has a corresponding matched instruction. BuildMIAction
should support not having one to cover the case where a rule produces more
instructions than it matched.
llvm-svn: 316463
This patch enables the import of stores. Unfortunately, doing so by itself,
loses an optimization where storing 0 to memory makes use of WZR/XZR.
To mitigate this, this patch also introduces a new feature that allows register
operands to nominate a zero register. When this is done, GlobalISel will
substitute (G_CONSTANT 0) with the nominated register automatically. This
is currently configured to only apply to the stores.
Applying it to GPR32/GPR64 register classes in general will be done after
review see (https://reviews.llvm.org/D39150).
llvm-svn: 316360
This is similar to how we generate the VEX tables.
More fixes are still needed for the instructions that use EVEX.b (broadcast and embedded rounding).
llvm-svn: 316294
This introduces a new operand type to encode the whether the index register should be XMM/YMM/ZMM. And new code to fixup the results created by readSIB.
This has the nice effect of removing a bunch of code that hard coded the name of every GATHER and SCATTER instruction to map the index type.
This fixes PR32807.
llvm-svn: 316273
MSVC doesn't seem to like implicitly instantiating addPredicate and then
explicitly specializing it later. It causes an internal compiler error.
llvm-svn: 315930
Summary:
iPTR is a pointer of subtarget-specific size to any address space. Therefore
type checks on this size derive the SizeInBits from a subtarget hook.
At this point, we can import the simplests G_LOAD rules and select load
instructions using them. Further patches will support for the predicates to
enable additional loads as well as the stores.
The previous commit failed on MSVC due to a failure to convert an
initializer_list to a std::vector. Hopefully, MSVC will accept this version.
Depends on D37457
Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar
Reviewed By: qcolombet
Subscribers: kristof.beyls, javed.absar, llvm-commits, igorb
Differential Revision: https://reviews.llvm.org/D37458
llvm-svn: 315887
Summary:
iPTR is a pointer of subtarget-specific size to any address space. Therefore
type checks on this size derive the SizeInBits from a subtarget hook.
At this point, we can import the simplests G_LOAD rules and select load
instructions using them. Further patches will support for the predicates to
enable additional loads as well as the stores.
Depends on D37457
Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar
Reviewed By: qcolombet
Subscribers: kristof.beyls, javed.absar, llvm-commits, igorb
Differential Revision: https://reviews.llvm.org/D37458
llvm-svn: 315885
Summary:
This includes some context-sensitivity in the MVT to LLT conversion so that
pointer types are tested correctly.
FIXME: I'm not happy with the way this is done since everything is a
special-case. I've yet to find a reasonable way to implement it.
select-load.mir fails because <1 x s64> loads in tablegen get priority over s64
loads. This is fixed in the next patch and as such they should be committed
together, I've posted them separately to help with the review.
Depends on D37456
Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar
Subscribers: kristof.beyls, javed.absar, llvm-commits, igorb
Differential Revision: https://reviews.llvm.org/D37457
llvm-svn: 315884
Summary:
It's possible for a ComplexPattern to be used as an operator in a match
pattern. This is used by the load/store patterns in AArch64 to name the
suboperands returned by ComplexPattern predicate so that they can be broken
apart and referenced independently in the result pattern.
This patch adds support for this in order to enable the import of load/store
patterns.
Depends on D37445
Hopefully fixed the ambiguous constructor that a large number of bots reported.
Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar
Reviewed By: qcolombet
Subscribers: aemerson, javed.absar, igorb, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D37456
llvm-svn: 315869
Summary:
It's possible for a ComplexPattern to be used as an operator in a match
pattern. This is used by the load/store patterns in AArch64 to name the
suboperands returned by ComplexPattern predicate so that they can be broken
apart and referenced independently in the result pattern.
This patch adds support for this in order to enable the import of load/store
patterns.
Depends on D37445
Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar
Reviewed By: qcolombet
Subscribers: aemerson, javed.absar, igorb, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D37456
llvm-svn: 315863
In type inference, an empty type set for a specific hw mode is not an
error. In earlier stages of the design it was, but having to use non-
parameterized types with target intrinsics necessarily led to type
contradictions: since the intrinsics used specific types, they were
only valid for a specific hw mode, and the resulting type set for other
modes ended up empty. To accommodate the existence of such intrinsics
individual type sets were allowed to be empty as long as not all sets
were empty.
llvm-svn: 315858
Summary:
There is an important mismatch between ISD::LOAD and G_LOAD (and likewise for
ISD::STORE and G_STORE). In SelectionDAG, ISD::LOAD is a non-atomic load
and atomic loads are handled by a separate node. However, this is not true of
GlobalISel's G_LOAD. For G_LOAD, the MachineMemOperand indicates the atomicity
of the operation. As a result, this mapping must also add a predicate that
checks for non-atomic MachineMemOperands.
This is NFC since these nodes always have predicates in practice and are
therefore always rejected at the moment.
Depends on D37443
Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar
Reviewed By: qcolombet
Subscribers: kristof.beyls, llvm-commits, igorb
Differential Revision: https://reviews.llvm.org/D37445
llvm-svn: 315843
Summary:
GlobalISel and SelectionDAG require different code for the common
load/store predicates due to differences in the representation.
For example:
SelectionDAG: (load<signext,i8>:i32 GPR32:$addr) // The <> denote properties of the SDNode that are not printed in the DAG
GlobalISel: (G_SEXT:s32 (G_LOAD:s8 GPR32:$addr))
Even without that, differences in the IR (SDNode vs MachineInstr) require
differences in the C++ predicate.
This patch moves the implementation of the common load/store predicates
into tablegen so that it can handle these differences.
It's NFC for SelectionDAG since it emits equivalent code and it's NFC for
GlobalISel since the rules involving the relevant predicates are still
rejected by the importer.
Depends on D36618
Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar
Subscribers: llvm-commits, igorb
Differential Revision: https://reviews.llvm.org/D37443
Includes a partial revert of r315826 since this patch makes it necessary for
getPredCode() to return a std::string and getImmCode() should have the same
interface as getPredCode().
llvm-svn: 315841
Summary:
Operand variable lookups are now performed by the RuleMatcher rather than
searching the whole matcher hierarchy for a match. This revealed a wrong-code
bug that currently affects ARM and X86 where patterns that use a variable more
than once in the match pattern will be imported but won't check that the
operands are identical. This can cause the tablegen-erated matcher to
accept matches that should be rejected.
Depends on D36569
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Subscribers: aemerson, igorb, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D36618
llvm-svn: 315780
Summary:
There's only a tablegen testcase for IntImmLeaf and not a CodeGen one
because the relevant rules are rejected for other reasons at the moment.
On AArch64, it's because there's an SDNodeXForm attached to the operand.
On X86, it's because the rule either emits multiple instructions or has
another predicate using PatFrag which cannot easily be supported at the
same time.
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Reviewed By: qcolombet
Subscribers: aemerson, javed.absar, igorb, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D36569
llvm-svn: 315761
Summary:
The purpose of this patch is to expose more information about ImmLeaf-like
PatLeaf's so that GlobalISel can learn to import them. Previously, ImmLeaf
could only be used to test int64_t's produced by sign-extending an APInt.
Other tests on immediates had to use the generic PatLeaf and extract the
constant using C++.
With this patch, tablegen will know how to generate predicates for APInt,
and APFloat. This will allow it to 'do the right thing' for both SelectionDAG
and GlobalISel which require different methods of extracting the immediate
from the IR.
This is NFC for SelectionDAG since the new code is equivalent to the
previous code. It's also NFC for FastISel because FastIselShouldIgnore is 1
for the ImmLeaf subclasses. Enabling FastIselShouldIgnore == 0 for these new
subclasses will require a significant re-factor of FastISel.
For GlobalISel, it's currently NFC because the relevant code to import the
affected rules is not yet present. This will be added in a later patch.
Depends on D36086
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Reviewed By: qcolombet
Subscribers: bjope, aemerson, rengolin, javed.absar, igorb, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D36534
llvm-svn: 315747
I'm about to commit a patch that makes them necessary for getPredCode() and
it would be strange for getPredCode() and getImmCode() to require different
usage.
llvm-svn: 315733
Summary:
Add LLVM_FORCE_ENABLE_DUMP cmake option, and use it along with
LLVM_ENABLE_ASSERTIONS to set LLVM_ENABLE_DUMP.
Remove NDEBUG and only use LLVM_ENABLE_DUMP to enable dump methods.
Move definition of LLVM_ENABLE_DUMP from config.h to llvm-config.h so
it'll be picked up by public headers.
Differential Revision: https://reviews.llvm.org/D38406
llvm-svn: 315590
This reverts commit 4e4ee1c507e2707bb3c208e1e1b6551c3015cbf5.
This is failing due to some code that isn't built on MSVC
so I didn't catch. Not immediately obvious how to fix this
at first glance, so I'm reverting for now.
llvm-svn: 315536
There's a lot of misuse of Twine scattered around LLVM. This
ranges in severity from benign (returning a Twine from a function
by value that is just a string literal) to pretty sketchy (storing
a Twine by value in a class). While there are some uses for
copying Twines, most of the very compelling ones are confined
to the Twine class implementation itself, and other uses are
either dubious or easily worked around.
This patch makes Twine's copy constructor private, and fixes up
all callsites.
Differential Revision: https://reviews.llvm.org/D38767
llvm-svn: 315530
This adds debug tracing to the table-generated assembly instruction matcher,
enabled by the -debug-only=asm-matcher option.
The changes in the target AsmParsers are to add an MCInstrInfo reference under
a consistent name, so that we can use it from table-generated code. This was
already being used this way for targets that use deprecation warnings, but 5
targets did not have it, and Hexagon had it under a different name to the other
backends.
llvm-svn: 315445
This allows a DiagnosticType and/or DiagnosticString to be associated
with a RegisterClass in tablegen, so that we can emit diagnostics in the
assembler when a register operand is incorrect.
DiagnosticType creates a predictable enum value, which gets returned as
the error code when an operand does not match, and can be used by the
assembly parser to map to a user-facing diagnostic. DiagnosticString
creates an anonymous enum value (currently based on the tablegen class
name), and a function to map from enum values to strings will be
generated. Both of these work the same was as they do for AsmOperand.
This isn't used by any targets yet, but has one (positive) side-effect.
It improves the diagnostic codes returned by validateOperandClass - we
always want to emit the diagnostic that relates to the expected operand
class, but this wasn't always being done when the expected and actual
classes were completely different (token/register/custom). This causes a
few AArch64 diagnostics to be improved, as Match_InvalidOperand was
being returned instead of a specific diagnostic type.
Differential revision: https://reviews.llvm.org/D36691
llvm-svn: 315295
It's rare but there are a small number of patterns like this:
(set i64:$dst, (add i64:$src1, i64:$src2))
These should be equivalent to register classes except they shouldn't check for
a specific register bank.
This doesn't occur in AArch64/ARM/X86 but does occasionally come up in other
in-tree targets such as BPF.
llvm-svn: 315226
After the original commit ([[ https://reviews.llvm.org/rL304088 | rL304088 ]]) was reverted, a discussion in llvm-dev was opened on 'how to accomplish this task'.
In the discussion we concluded that the best way to achieve our goal (which is to automate the folding tables and remove the manually maintained tables) is:
# Commit the tablegen backend disabled by default.
# Proceed with an incremental updating of the manual tables - while checking the validity of each added entry.
# Repeat previous step until we reach a state where the generated and the manual tables are identical. Then we can safely remove the manual tables and include the generated tables instead.
# Schedule periodical (1 week/2 weeks/1 month) runs of the pass:
- if changes appear (new entries):
- make sure the entries are legal
- If they are not, mark them as illegal to folding
- Commit the changes (if there are any).
CMake flag added for this purpose is "X86_GEN_FOLD_TABLES". Building with this flags will run the pass and emit the X86GenFoldTables.inc file under build/lib/Target/X86/ directory which is a good reference for any developer who wants to take part in the effort of completing the current folding tables.
Differential Revision: https://reviews.llvm.org/D38028
llvm-svn: 315173
The assertion tests were using count() instead of testing the find result, resulting in double the number of searches in debug/assert builds.
Instead, call find once (like the release builds do) and assert the result against end().
llvm-svn: 315151
Avoid unnecessary std::string creations in the TreePredicateFn getters and in CodeGenDAGPatterns::getSDNodeNamed
Differential Revision: https://reviews.llvm.org/D38624
llvm-svn: 315148
This adds a DiagnosticString member to the AsmOperand tablegen class, so
that the diagnostic text to be used when an assembly operand is
incorrect can be stored in the tablegen description of the operand,
rather than in a separate switch statement in the AsmParser.
If DiagnosticString is used for any operands, tablegen will emit a
getMatchKindDiag function, to map from diagnostic enums to strings.
Differential revision: https://reviews.llvm.org/D31606
llvm-svn: 314803
The current table-generated assembly instruction matcher returns a
64-bit error code when matching fails. Since multiple instruction
encodings with the same mnemonic can fail for different reasons, it uses
some heuristics to decide which message is important.
This heuristic does not work well for targets that have many encodings
with the same mnemonic but different operands, or which have different
versions of instructions controlled by subtarget features, as it is hard
to know which encoding the user was intending to use.
Instead of trying to improve the heuristic in the table-generated
matcher, this patch changes it to report a list of near-miss encodings.
This list contains an entry for each encoding with the correct mnemonic,
but with exactly one thing preventing it from being valid. This thing
could be a single invalid operand, a missing target feature or a failed
target-specific validation function.
The target-specific assembly parser can then report an error message
giving multiple options for instruction variants that the user may have
been trying to use. For example, I am working on a patch to use this for
ARM, which can give this error for an invalid instruction for ARMv6-M:
<stdin>:8:3: error: invalid instruction, multiple near-miss encodings found
adds r0, r1, #0x8
^
<stdin>:8:3: note: for one encoding: instruction requires: thumb2
adds r0, r1, #0x8
^
<stdin>:8:16: note: for one encoding: expected an integer in range [0, 7]
adds r0, r1, #0x8
^
<stdin>:8:16: note: for one encoding: expected a register in range [r0, r7]
adds r0, r1, #0x8
^
This also allows the target-specific assembly parser to apply its own
heuristics to suppress some errors. For example, the error "instruction
requires: arm-mode" is never going to be useful when targeting an
M-profile architecture (which does not have ARM mode).
This patch just adds the target-independent mechanism for doing this,
all targets still use the old mechanism. I've added a bit in the
AsmParser tablegen class to allow targets to switch to this new
mechanism. To use this, the target-specific assembly parser will have to
be modified for the change in signature of MatchInstructionImpl, and to
report errors based on the list of near-misses.
Differential revision: https://reviews.llvm.org/D27620
llvm-svn: 314774
Also add operator<< for use with raw_ostream to InfoByHwMode and its
derived classes.
Recommitting r313989 with the fix for unresolved references: explicitly
define the operator<< in namespace llvm.
llvm-svn: 314004
Avoid unnecessary std::string creations during TypeSetByHwMode::writeToStream.
Found during investigations into PR28222
Differential Revision: https://reviews.llvm.org/D38174
llvm-svn: 313983
This changes some STL data types to corresponding LLVM
data types that have better performance characteristics.
Differential Revision: https://reviews.llvm.org/D37957
llvm-svn: 313783
The generated DAG isel file currently makes use of formatted_raw_ostream primarily for generating a hierarchical representation while also skipping over the initial comment that contains the current index.
It was reported in D37957 that this formatting might be slow due to the need to keep track of column numbers by monitoring all the written data for new lines.
This patch attempts to rewrite the emitter to make use of simpler formatting mechanisms to generate a fairly similar output. The main difference is that the number in the index comment is now right justified and padded with spaces inside the comment. Previously we appended the spaces after the comment.
Differential Revision: https://reviews.llvm.org/D37966
llvm-svn: 313674
Add some member types to MachineValueTypeSet::const_iterator so that
iterator_traits can work with it.
Improve TableGen performance of -gen-dag-isel (motivated by X86 backend)
The introduction of parameterized register classes in r313271 caused the
matcher generation code in TableGen to run much slower, particularly so
in the unoptimized (debug) build. This patch recovers some of the lost
performance.
Summary of changes:
- Cache the set of legal types in TypeInfer::getLegalTypes. The contents
of this set do not change.
- Add LLVM_ATTRIBUTE_ALWAYS_INLINE to several small functions. Normally
this would not be necessary, but in the debug build TableGen is not
optimized, so this helps a little bit.
- Add an early exit from TypeSetByHwMode::operator== for the case when
one or both arguments are "simple", i.e. only have one mode. This
saves some time in GenerateVariants.
- Finally, replace the underlying storage type in TypeSetByHwMode::SetType
with MachineValueTypeSet based on std::array instead of std::set.
This significantly reduces the number of memory allocation calls.
I've done a number of experiments with the underlying type of InfoByHwMode.
The type is a map, and for targets that do not use the parameterization,
this map has only one entry. The best (unoptimized) performance, somewhat
surprisingly came from std::map, followed closely by std::unordered_map.
DenseMap was the slowest by a large margin.
Various hand-crafted solutions (emulating enough of the map interface
not to make sweeping changes to the users) did not yield any observable
improvements.
llvm-svn: 313660
The introduction of parameterized register classes in r313271 caused the
matcher generation code in TableGen to run much slower, particularly so
in the unoptimized (debug) build. This patch recovers some of the lost
performance.
Summary of changes:
- Cache the set of legal types in TypeInfer::getLegalTypes. The contents
of this set do not change.
- Add LLVM_ATTRIBUTE_ALWAYS_INLINE to several small functions. Normally
this would not be necessary, but in the debug build TableGen is not
optimized, so this helps a little bit.
- Add an early exit from TypeSetByHwMode::operator== for the case when
one or both arguments are "simple", i.e. only have one mode. This
saves some time in GenerateVariants.
- Finally, replace the underlying storage type in TypeSetByHwMode::SetType
with MachineValueTypeSet based on std::array instead of std::set.
This significantly reduces the number of memory allocation calls.
I've done a number of experiments with the underlying type of InfoByHwMode.
The type is a map, and for targets that do not use the parameterization,
this map has only one entry. The best (unoptimized) performance, somewhat
surprisingly came from std::map, followed closely by std::unordered_map.
DenseMap was the slowest by a large margin.
Various hand-crafted solutions (emulating enough of the map interface
not to make sweeping changes to the users) did not yield any observable
improvements.
llvm-svn: 313647
These are removed in C++17. We still have some users of
unary_function::argument_type, so just spell that typedef out. No
functionality change intended.
Note that many of the argument types are actually wrong :)
llvm-svn: 313287
This replaces TableGen's type inference to operate on parameterized
types instead of MVTs, and as a consequence, some interfaces have
changed:
- Uses of MVTs are replaced by ValueTypeByHwMode.
- EEVT::TypeSet is replaced by TypeSetByHwMode.
This affects the way that types and type sets are printed, and the
tests relying on that have been updated.
There are certain users of the inferred types outside of TableGen
itself, namely FastISel and GlobalISel. For those users, the way
that the types are accessed have changed. For typical scenarios,
these replacements can be used:
- TreePatternNode::getType(ResNo) -> getSimpleType(ResNo)
- TreePatternNode::hasTypeSet(ResNo) -> hasConcreteType(ResNo)
- TypeSet::isConcrete -> TypeSetByHwMode::isValueTypeByHwMode(false)
For more information, please refer to the review page.
Differential Revision: https://reviews.llvm.org/D31951
llvm-svn: 313271
Summary:
Since asan is linked dynamically on Darwin, the weak interface symbol
is removed by -Wl,-dead_strip.
Reviewers: kcc, compnerd, aaron.ballman
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D37636
llvm-svn: 312914
Summary:
Tablegen already supports commutable instrinsics with more than 2 operands. There it just assumes the first two operands are commutable.
I plan to use this to improve the generation of FMA patterns in the X86 backend.
Reviewers: aymanmus, zvi, RKSimon, spatel, arsenm
Reviewed By: arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: https://reviews.llvm.org/D37430
llvm-svn: 312464
Summary:
Add support for autocompleting values of -std= by including
LangStandards.def. This patch relies on D36782, and is using two-stage
code generation.
Reviewers: v.g.vassilev, teemperor, ruiu
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D36820
llvm-svn: 311971
This reverts commit 7c46b80c022e18d43c1fdafb117b0c409c5a6d1e.
r311552 broke lld buildbot because I've changed OptionInfos type from
ArrayRef to vector. However the bug is fixed, so I'll commit this again.
llvm-svn: 311958
This fixes 2 problems in subregister hierarchies with multiple levels
and tuples:
1) For bigger tuples computing secondary subregs would miss 2nd order
effects. In the test case a register like `S10_S11_S12_S13_S14` with D5
= S10_S11, D6 = S12_S13 we would correctly compute sub0 = D5, sub1 = D6
but would miss the fact that we could now form ssub0_ssub1_ssub2_ssub3
(aka sub0_sub1) = D5_D6. This is fixed by changing
computeSecondarySubRegs() to compute a fixpoint.
2) Fixing 1) exposed a problem where TableGen would create multiple
names for effectively the same subregister index. In the test case
the subregister index sub0 is composed from ssub0 and ssub1, and sub1 is
composed from ssub2 and ssub3. TableGen should not create both sub0_sub1
and ssub0_ssub1_ssub2_ssub3 as infered subregister indexes. This changes
the code to build a transitive closure of the subregister components
before forming new concatenated subregister indexes.
This fix was developed for an out of tree target. For the in-tree
targets the only change is in the register information computed for ARM.
There is a slight chance this fixed/improved some register coalescing
around the QQQQ/QQ register classes there but I couldn't see/provoke any
code generation differences.
Differential Revision: https://reviews.llvm.org/D36913
llvm-svn: 311914
Adds a new --gen-register-info-debug-dump mode to tablegen that dumps various register related information:
- List of register classes with super and subclasses
- List of subregister indexes with lanemasks
- List of registers with subregisters
I will use this in an upcoming commit to create a test.
It may also be useful for target developers wanting to get an overview
of all the register related information, esp. the things inferred by
tablegen and not directly visible in the .td file.
Differential Revision: https://reviews.llvm.org/D36911
llvm-svn: 311913
This fixes a warning when there are zero defined predicates and also fixes an
unnoticed bug where the first predicate in the table was unusable.
llvm-svn: 311684
Summary:
This patch adds support for predicates on imm nodes but only for ImmLeaf and not
for PatLeaf or PatFrag and only where the value does not need to be transformed
before being rendered into the instruction.
The limitation on PatLeaf/PatFrag/SDNodeXForm is due to differences in the
necessary target-supplied C++ for GlobalISel.
Depends on D36085
The previous commit was reverted for breaking the build but this appears to have
been the recurring problem on the Windows bots with tablegen not being re-run
when llvm-tblgen is changed but the .td's aren't. If it re-occurs then forcing a
build with clean=True should fix it but this string should do this in advance:
Requires a clean build.
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Reviewed By: rovka
Subscribers: kristof.beyls, javed.absar, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D36086
llvm-svn: 311645
Summary:
This is a patch for clang autocomplete feature.
It will collect values which -analyzer-checker takes, which is defined in
clang/StaticAnalyzer/Checkers/Checkers.inc, dynamically.
First, from ValuesCode class in Options.td, TableGen will generate C++
code in Options.inc. Options.inc will be included in DriverOptions.cpp, and
calls OptTable's addValues function. addValues function will add second
argument to Option's Values class. Values contains string like "foo,bar,.."
which is handed to Values class
in OptTable.
Reviewers: v.g.vassilev, teemperor, ruiu
Subscribers: hiraditya, cfe-commits
Differential Revision: https://reviews.llvm.org/D36782
llvm-svn: 311552
Summary:
This patch adds support for predicates on imm nodes but only for ImmLeaf and not for PatLeaf or PatFrag and only where the value does not need to be transformed before being rendered into the instruction.
The limitation on PatLeaf/PatFrag/SDNodeXForm is due to differences in the necessary target-supplied C++ for GlobalISel.
Depends on D36085
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Reviewed By: rovka
Subscribers: kristof.beyls, javed.absar, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D36086
llvm-svn: 311546
Summary:
Generate the type table from the types used by a target rather than hard-coding
the union of types used by all targets.
Depends on D36084
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Reviewed By: rovka
Subscribers: kristof.beyls, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D36085
llvm-svn: 311084
Summary:
Support the case where an operand of a pattern is also the whole of the
result pattern. In this case the original result and all its uses must be
replaced by the operand. However, register class restrictions can require
a COPY. This patch handles both cases by always emitting the copy and
leaving it for the register allocator to optimize.
The previous commit failed on Windows machines due to a flaw in the sort
predicate which allowed both A < B < C and B == C to be satisfied
simultaneously. The cause of this was some sloppiness in the priority order of
G_CONSTANT instructions compared to other instructions. These had equal priority
because it makes no difference, however there were operands had higher priority
than G_CONSTANT but lower priority than any other instruction. As a result, a
priority order between G_CONSTANT and other instructions must be enforced to
ensure the predicate defines a strict weak order.
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Subscribers: javed.absar, kristof.beyls, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D36084
llvm-svn: 311076
As expected, this failed on the windows bots but the instrumentation showed
something interesting. The ADD8ri and INC8r rules are never directly compared
on the windows machines. That implies that the issue lies in transitivity of
the Compare predicate. I believe I've already verified that but maybe I missed
something.
llvm-svn: 310922
Summary:
Support the case where an operand of a pattern is also the whole of the
result pattern. In this case the original result and all its uses must be
replaced by the operand. However, register class restrictions can require
a COPY. This patch handles both cases by always emitting the copy and
leaving it for the register allocator to optimize.
The previous commit failed on the windows bots and this one is likely to fail
on those same bots. However, the added instrumentation should reveal a particular
isHigherPriorityThan() evaluation which I'm expecting to expose that
these machines are weighing priority of two rules differently from the
non-windows machines.
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Subscribers: javed.absar, kristof.beyls, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D36084
llvm-svn: 310919
Two of the Windows bots are failing test\CodeGen\X86\GlobalISel\select-inc.mir
which should not have been affected by the change. Reverting while I investigate.
Also reverted r310735 because it builds on r310716.
llvm-svn: 310745
Summary:
Generate the type table from the types used by a target rather than hard-coding
the union of types used by all targets.
Depends on D36084
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Reviewed By: rovka
Subscribers: kristof.beyls, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D36085
llvm-svn: 310735
Summary:
Support the case where an operand of a pattern is also the whole of the
result pattern. In this case the original result and all its uses must be
replaced by the operand. However, register class restrictions can require
a COPY. This patch handles both cases by always emitting the copy and
leaving it for the register allocator to optimize.
Depends on D35833
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Subscribers: javed.absar, kristof.beyls, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D36084
llvm-svn: 310716
Summary:
This patch enables the import of rules containing 'imm' operands that do not
constrain the acceptable values using predicates. Support for ImmLeaf will
arrive in a later patch.
Depends on D35681
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Reviewed By: rovka
Subscribers: kristof.beyls, javed.absar, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D35833
llvm-svn: 310343
Relanding after fixing UB issue with DefaultOffsets.
Consider the following instruction: "inst.eq $dst, $src" where ".eq"
is an optional flag operand. The $src and $dst operands are
registers. If we parse the instruction "inst r0, r1", the flag is not
present and it will be marked in the "OptionalOperandsMask" variable.
After the matching is complete we call the "convertToMCInst" method.
The current implementation works only if the optional operands are at
the end of the array. The "Operands" array looks like [token:"inst",
reg:r0, reg:r1]. The first operand that must be added to the MCInst
is the destination, the r0 register. The "OpIdx" (in the Operands
array) for this register is 2. However, since the flag is not present
in the Operands, the actual index for r0 should be 1. The flag is not
present since we rely on the default value.
This patch removes the "NumDefaults" variable and replaces it with an
array (DefaultsOffset). This array contains an index for each operand
(excluding the mnemonic). At each index, the array contains the
number of optional operands that should be subtracted. For the
previous example, this array looks like this: [0, 1, 1]. When we need
to access the r0 register, we compute its index as 2 -
DefaultsOffset[1] = 1.
Patch by Alexandru Guduleasa!
Reviewers: SamWot, nhaustov, niravd
Reviewed By: niravd
Subscribers: vitalybuka, llvm-commits
Differential Revision: https://reviews.llvm.org/D35998
llvm-svn: 310254
Consider the following instruction: "inst.eq $dst, $src" where ".eq"
is an optional flag operand. The $src and $dst operands are
registers. If we parse the instruction "inst r0, r1", the flag is not
present and it will be marked in the "OptionalOperandsMask" variable.
After the matching is complete we call the "convertToMCInst" method.
The current implementation works only if the optional operands are at
the end of the array. The "Operands" array looks like [token:"inst",
reg:r0, reg:r1]. The first operand that must be added to the MCInst
is the destination, the r0 register. The "OpIdx" (in the Operands
array) for this register is 2. However, since the flag is not present
in the Operands, the actual index for r0 should be 1. The flag is not
present since we rely on the default value.
This patch removes the "NumDefaults" variable and replaces it with an
array (DefaultsOffset). This array contains an index for each operand
(excluding the mnemonic). At each index, the array contains the
number of optional operands that should be subtracted. For the
previous example, this array looks like this: [0, 1, 1]. When we need
to access the r0 register, we compute its index as 2 -
DefaultsOffset[1] = 1.
Patch by Alexandru Guduleasa!
Reviewers: SamWot, nhaustov, niravd
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D35998
llvm-svn: 309949
Summary:
We only need to merge memory operands for instructions that access
memory. This slightly reduces the number of actions executed.
Reviewers: MatzeB, rovka, dsanders
Reviewed By: dsanders
Subscribers: aemerson, igorb, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D36151
llvm-svn: 309944
Summary:
Fix a bug discovered in an out-of-tree target where memoperands from
pseudo-instructions that weren't part of the match were being merged into the
result instructions as part of GIR_MergeMemOperands.
This bug was caused by a change to the handling of State.MIs between rules when
the state machine tables were fused into a single table. Previously, each rule
would reset State.MIs using State.MIs.resize(1) but this is no longer done, as a
result stale data is occasionally left in some elements of State.MIs. Most
opcodes aren't affected by this but GIR_MergeMemOperands merges all memoperands
from the intructions recorded in State.MIs into the result instruction.
Suppose for example, we processed but rejected the following pattern:
(signextend (load x))
at this point, State.MIs contains the signextend and the load. Now suppose we
process and accept this pattern:
(add x, y)
at this point, State.MIs contains the add as well as the (now irrelevant) load.
When GIR_MergeMemOperands is processed, the memoperands from that irrelevant
load will be merged into the result instruction even though it was not part of
the match.
Bringing back the State.MIs.resize(1) would fix the problem but it would limit
our ability to optimize the table in the future. Instead, this patch fixes the
problem by explicitly stating which instructions should be merged into the result.
There's no direct test case in this commit because a test case would be very brittle.
However, at the time of writing this should fix the failures in
http://green.lab.llvm.org/green/job/Compiler_Verifiers_GlobalISEL/ as well as a
failure in test/CodeGen/ARM/GlobalISel/arm-isel.ll when expensive checks are enabled.
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Subscribers: fhahn, kristof.beyls, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D36094
llvm-svn: 309804
Summary:
Now that we have control flow in place, fuse the per-rule tables into a
single table. This is a compile-time saving at this point. However, this will
also enable the optimization of a table so that similar instructions can be
tested together, reducing the time spent on the matching the code.
This is NFC in terms of externally visible behaviour but some internals have
changed slightly. State.MIs is no longer reset between each rule that is
attempted because it's not necessary to do so. As a consequence of this the
restriction on the order that instructions are added to State.MIs has been
relaxed to only affect recorded instructions that require new elements to be
added to the vector. GIM_RecordInsn can now write to any element from 1 to
State.MIs.size() instead of just State.MIs.size().
The compile-time regressions from the last commit were caused by the ARM target
including a non-const variable (zero_reg) in the table and therefore generating
an initializer for it. That variable is now const.
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Reviewed By: rovka
Subscribers: kristof.beyls, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D35681
llvm-svn: 309264
The ARM bots have started failing and while this patch should be an improvement
for these bots, it's also the only suspect in the blamelist. Reverting while
Diana and I investigate the problem.
llvm-svn: 309111
Summary:
Now that we have control flow in place, fuse the per-rule tables into a
single table. This is a compile-time saving at this point. However, this will
also enable the optimization of a table so that similar instructions can be
tested together, reducing the time spent on the matching the code.
This is NFC in terms of externally visible behaviour but some internals have
changed slightly. State.MIs is no longer reset between each rule that is
attempted because it's not necessary to do so. As a consequence of this the
restriction on the order that instructions are added to State.MIs has been
relaxed to only affect recorded instructions that require new elements to be
added to the vector. GIM_RecordInsn can now write to any element from 1 to
State.MIs.size() instead of just State.MIs.size().
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Reviewed By: rovka
Subscribers: kristof.beyls, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D35681
llvm-svn: 309094
Summary:
This will allow us to merge the various sub-tables into a single table. This is a
compile-time saving at this point. However, this will also enable the optimization
of a table so that similar instructions can be tested together, reducing the time
spent on the matching the code.
The bulk of this patch is a mechanical conversion to the new MatchTable object
which is responsible for tracking label definitions and filling in the index of
the jump targets. It is also responsible for nicely formatting the table.
This was necessary to support the new GIM_Try opcode which takes the index to
jump to if the match should fail. This value is unknown during table
construction and is filled in during emission. To support nesting try-blocks
(although we currently don't emit tables with nested try-blocks), GIM_Reject
has been re-introduced to explicitly exit a try-block or fail the overall match
if there are no active try-blocks.
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Reviewed By: rovka
Subscribers: kristof.beyls, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D35117
llvm-svn: 308596
One place compared with 32, which I've replaced with LaneBitmask::BitWidth.
The other places are shifts of a constant 1 by a lane number. But if LaneBitmask were to be a larger type than 32-bits like 64-bits, the 1 would need to be 1ULL to do a 64-bit shift. To hide this I've added a LanebitMask::getLane that hides the shift and make sures the 1 is casted to correct type first.
llvm-svn: 308042