Summary:
Also fixup rL371928 for cases that occur on our out-of-tree backend
There were still quite a few intermediate APInts and this caused the
compile time of MCCodeEmitter for our target to jump from 16s up to
~5m40s. This patch, brings it back down to ~17s by eliminating pretty
much all of them using two new APInt functions (extractBitsAsZExtValue(),
insertBits() but with a uint64_t). The exact conditions for eliminating
them is that the field extracted/inserted must be <=64-bit which is
almost always true.
Note: The two new APInt API's assume that APInt::WordSize is at least
64-bit because that means they touch at most 2 APInt words. They
statically assert that's true. It seems very unlikely that someone
is patching it to be smaller so this should be fine.
Reviewers: jmolloy
Reviewed By: jmolloy
Subscribers: hiraditya, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67686
llvm-svn: 372243
Some VLIW instruction sets are Very Long Indeed. Using uint64_t constricts the Inst encoding to 64 bits (naturally).
This change switches CodeEmitter to a mode that uses APInts when Inst's bitwidth is > 64 bits (NFC for existing targets).
When Inst.BitWidth > 64 the prototype changes to:
void TargetMCCodeEmitter::getBinaryCodeForInstr(const MCInst &MI,
SmallVectorImpl<MCFixup> &Fixups,
APInt &Inst,
APInt &Scratch,
const MCSubtargetInfo &STI);
The Inst parameter returns the encoded instruction, the Scratch parameter is used internally for manipulating operands and is exposed so that the underlying storage can be reused between calls to getBinaryCodeForInstr. The goal is to elide any APInt constructions that we can.
Similarly the operand encoding prototype changes to:
getMachineOpValue(const MCInst &MI, const MCOperand &MO, APInt &op, SmallVectorImpl<MCFixup> &Fixups, const MCSubtargetInfo &STI);
That is, the operand is passed by reference as APInt rather than returned as uint64_t.
To reiterate, this APInt mode is enabled only when Inst.BitWidth > 64, so this change is NFC for existing targets.
llvm-svn: 371928
The scalar f64 patterns don't work yet because they fail on multiple
results from the unused implicit def of scc in the result bit
operation.
llvm-svn: 371542
This was only using the correct register constraints if this was the
final result instruction. If the extract was a sub instruction of the
result, it would attempt to use GIR_ConstrainSelectedInstOperands on a
COPY, which won't work. Move the handling to
createAndImportSubInstructionRenderer so it works correctly.
I don't fully understand why runOnPattern and
createAndImportSubInstructionRenderer both need to handle these
special cases, and constrain them with slightly different methods. If
I remove the runOnPattern handling, it does break the constraint when
the final result instruction is EXTRACT_SUBREG.
llvm-svn: 371150
This is a special case because one node maps to two different G_
instructions, and the operand order is changed.
This mostly enables G_FCMP for AMDPGPU. G_ICMP is still manually
selected for now since it has the SALU and VALU complication to deal
with.
llvm-svn: 370280
Reuse the logic for INSERT_SUBREG to also import SUBREG_TO_REG patterns.
- Split `inferSuperRegisterClass` into two functions, one which tries to use
an existing TreePatternNode (`inferSuperRegisterClassForNode`), and one that
doesn't. SUBREG_TO_REG doesn't have a node to leverage, which is the cause
for the split.
- Rename GlobalISelEmitterInsertSubreg.td to GlobalISelEmitterSubreg.td and
update it.
- Update impacted tests in the AArch64 and X86 backends.
This is kind of a hit/miss for code size improvements/regressions. E.g. in
add-ext.ll, we now get some identity copies. This isn't really anything the
importer can handle, since it's caused by a later pass introducing the copy for
the sake of correctness.
Differential Revision: https://reviews.llvm.org/D66769
llvm-svn: 370254
I thought `llvm::sort` was stable for some reason but it's not.
Use `llvm::stable_sort` in `CodeGenTarget::getSuperRegForSubReg`.
Original patch: https://reviews.llvm.org/D66498
llvm-svn: 370084
When EXPENSIVE_CHECKS are enabled, GlobalISelEmitterSubreg.td doesn't get
stable output.
Reverting while I debug it.
See: https://reviews.llvm.org/D66498
llvm-svn: 370080
This teaches the importer to handle INSERT_SUBREG instructions.
We were missing patterns involving INSERT_SUBREG in AArch64. It appears in
AArch64InstrInfo.td 107 times, and 14 times in AArch64InstrFormats.td.
To meaningfully import it, the GlobalISelEmitter needs to know how to infer a
super register class for a given register class.
This patch introduces the following:
- `getSuperRegForSubReg`, a function which finds the largest register class
which supports a value type and subregister index
- `inferSuperRegisterClass`, a function which finds the appropriate super
register class for an INSERT_SUBREG'
- `inferRegClassFromPattern`, a function which allows for some trivial
lookthrough into instructions
- `getRegClassFromLeaf`, a helper function which returns the register class for
a leaf `TreePatternNode`
- Support for subregister index operands in `importExplicitUseRenderer`
It also
- Updates tests in each backend which are impacted by the change
- Adds GlobalISelEmitterSubreg.td to test that we import and skip the expected
patterns
As a result of this patch, INSERT_SUBREG patterns in X86 may use the
LOW32_ADDR_ACCESS_RBP register class instead of GR32. This is correct, since the
register class contains the same registers as GR32 (except with the addition of
RBP). So, this also teaches X86 to handle that register class. This is in line
with X86ISelLowering, which treats this as a GR class.
Differential Revision: https://reviews.llvm.org/D66498
llvm-svn: 369973
This requires std::intializer_list to be a literal type, which it is
starting with C++14. The downside is that std::bitset is still not
constexpr-friendly so this change contains a re-implementation of most
of it.
Shrinks clang by ~60k.
llvm-svn: 369847
Overloaded intrinsics can use iPTRAny in used/input operands.
The GlobalISelEmitter doesn't know that these are pointers, so it treats them
as scalars. As a result, these intrinsics can't be imported.
This teaches the GlobalISelEmitter to recognize these as pointers rather than
scalars.
Differential Revision: https://reviews.llvm.org/D65756
llvm-svn: 369455
AMDGPU has some buffer intrinsics which theoretically could use
this. Some of the generated tables include the 3 and 4 element vector
versions of these rounded to 64-bits, which is ambiguous. Add these to
help the table disambiguate these.
Assertion change is for the path odd sized vectors now take for R600.
v3i16 is widened to v4i16, which then needs to be promoted to v4i32.
llvm-svn: 369038
Factor out commonly-used target code from the GlobalISelEmitter tests into
a GlobalISelEmitterCommon.td file. This is tested by the original
GlobalISelEmitter.td test.
This reduces the amount of boilerplate code necessary for tests like this.
Differential Revision: https://reviews.llvm.org/D65777
llvm-svn: 368757
Summary:
The problem:
When an operand had bits explicitly set to "1" (as in the InitValue.td test case attached), the decoder was ignoring those bits, and the DecoderMethod was receiving an input where the bits were still zero.
The solution:
We added an "InitValue" variable that stores the initial value of the operand based on what bits were explicitly initialized to 1 in TableGen code. The generated decoder code then uses that initial value to initialize the "tmp" variable, then calls fieldFromInstruction to read the values for the remaining bits that were left unknown in TableGen.
This is mainly useful when there are variations of an instruction that differ based on what bits are set in the operands, since this change makes it possible to access those bits in a DecoderMethod. The DecoderMethod can use those bits to know how to handle the input.
Patch by Nicolas Guillemot
Reviewers: craig.topper, dsanders, fhahn
Reviewed By: dsanders
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D63741
llvm-svn: 368458
This was causing a bug where non-truncating stores would be selected instead of truncating ones.
Differential Revision: https://reviews.llvm.org/D64845
llvm-svn: 367737
AMDGPU uses some custom code predicates for testing alignments.
I'm still having trouble comprehending the behavior of predicate bits
in the PatFrag hierarchy. Any attempt to abstract these properties
unexpectdly fails to apply them.
llvm-svn: 367373
If an intrinsic is defined without outputs, but having side effects,
it still can be removed completely from the program. This patch makes
TableGen not set Attribute::ReadNone for intrinsics which
are declared with IntrHasSideEffects.
Differential Revision: https://reviews.llvm.org/D64414
llvm-svn: 366312
Rather than an array of std::initializer_list, generate a table of
offsets and a flat array of the operands for getOperandType. This is a
bit more efficient on platforms that don't manage to get the array of
inintializer_lists initialized at link time (I'm looking at you
macOS). It's also quite quite a bit faster to compile.
llvm-svn: 366278
The InstrInfoEmitter outputs an enum called "OperandType" which gives
numerical IDs to each operand type. This patch makes use of this enum
to define a function called "getOperandType", which allows looking up
the type of an operand given its opcode and operand index.
Patch by Nicolas Guillemot. Thanks!
Differential Revision: https://reviews.llvm.org/D63320
llvm-svn: 366274
This was failing to import the AMDGPU truncstore patterns. The
truncating stores from 32-bit to 8/16 were then somehow being
incorrectly selected to a 4-byte store.
A separate check is emitted for the LLT size in comparison to the
specific memory VT, which looks strange to me but makes sense based on
the hierarchy of PatFrags used for the default truncstore PatFrags.
llvm-svn: 366129
Currently AMDGPU uses a CodePatPred to check address spaces from the
MachineMemOperand. Introduce a new first class property so that the
existing patterns can be easily modified to uses the new generated
predicate, which will also be handled for GlobalISel.
I would prefer these to match against the pointer type of the
instruction, but that would be difficult to get working with
SelectionDAG compatbility. This is much easier for now and will avoid
a painful tablegen rewrite for all the loads and stores.
I'm also not sure if there's a better way to encode multiple address
spaces in the table, rather than putting the number to expect.
llvm-svn: 366128
Some out of tree backend require larger vector type. Since maintaining the changes out of tree is difficult due to the many manual changes needed when adding a new type we are adding it even if no backend currently use it.
Differential Revision: https://reviews.llvm.org/D64141
Patch by Thomas Raoux!
llvm-svn: 365274
When a Tablegen instruction description uses `OperandWithDefaultOps`,
isel patterns for that instruction don't have to fill in the default
value for the operand in question. But the flip side is that they
actually //can't// override the defaults even if they want to.
This will be very inconvenient for the Arm backend, when we start
wanting to write isel patterns that generate the many MVE predicated
vector instructions, in the form with predication actually enabled. So
this small Tablegen fix makes it possible to write an isel pattern
either with or without values for a defaulted operand, and have the
default values filled in only if they are not overridden.
If all the defaulted operands come at the end of the instruction's
operand list, there's a natural way to match them up to the arguments
supplied in the pattern: consume pattern arguments until you run out,
then fill in any missing instruction operands with their default
values. But if defaulted and non-defaulted operands are interleaved,
it's less clear what to do. This does happen in existing targets (the
first example I came across was KILLGT, in the AMDGPU/R600 backend),
and of course they expect the previous behaviour (that the default for
those operands is used and a pattern argument is not consumed), so for
backwards compatibility I've stuck with that.
Reviewers: nhaehnle, hfinkel, dmgreen
Subscribers: mehdi_amini, javed.absar, tpr, kristof.beyls, steven_wu, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63814
llvm-svn: 365114
r363233 rewrote a bunch of the Intrin Emitter code, however the new
function to update the arg codes did not properly consider a pointer to
an any. This patch adds that logic.
Differential Revision: https://reviews.llvm.org/D63507
llvm-svn: 364364
This allows using anything that isn't a literal integer as the bounds
for a foreach. Some of the diagnostics aren't perfect, but nobody ever
accused tablegen of having good errors. For example, the existing
wording suggests a bitrange is valid, but as far as I can tell this
has never worked.
Fixes bug 41958.
llvm-svn: 361434
If you have more than one schedule model in your TableGen target
definitions, then the diagnostic "No schedule information for
instruction 'foo'" is rather unhelpful, because it doesn't tell you
_which_ schedule model is missing the necessary information (or, as it
might be, missing the UnsupportedFeatures definition that would stop
it thinking it needed it).
Extended the message to include the name of the schedule model that
it's complaining about.
Reviewers: nhaehnle, hfinkel, javedabsar, efriedma, javed.absar
Reviewed By: javed.absar
Subscribers: javed.absar, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60559
llvm-svn: 358389
Summary:
```
``!listsplat(a, size)``
A list value that contains the value ``a`` ``size`` times.
Example: ``!listsplat(0, 2)`` results in ``[0, 0]``.
```
I plan to use this in X86ScheduleBdVer2.td for LoadRes handling.
This is a little bit controversial because unlike every other binary operator
the types aren't identical.
Reviewers: stoklund, javed.absar, nhaehnle, craig.topper
Reviewed By: javed.absar
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60367
llvm-svn: 358117
When one mistakenly specifies 'def' instead of using 'defm',
the error message is quite misleading: 'Couldn't find class..'
Instead, it should recommend using defm if the multiclass of
same name exists.
Reviewed By: hfinkel
Differential Revision: https://reviews.llvm.org/D59294
llvm-svn: 356985
AMDGPU would like to use these MVTs.
Differential Revision: https://reviews.llvm.org/D58901
Change-Id: I6125fea810d7cc62a4b4de3d9904255a1233ae4e
llvm-svn: 356351
These two values correspond to the 'Empty' and 'Tombstone' special
keys defined by DenseMapInfo<int64_t>, which means that neither one
can be used as a key in DenseMap<int64_t, anything>. Hence, if you try
to use either of those values as an int literal, IntInit::get() fails
an assertion when it tries to insert them into its static cache of
int-literal objects.
Fixed by replacing the DenseMap with a std::map, which doesn't intrude
on the space of legal values of the key type.
Reviewers: nhaehnle, hfinkel, javedabsar, efriedma
Reviewed By: efriedma
Subscribers: fhahn, efriedma, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59016
llvm-svn: 355900
Currently one can concatenate strings using hash(#),
but not lists, although that would be a natural thing to do.
This patch allows one to write something like:
def : A<!listconcat([1,2], [3,4])>;
simply as :
def : A<[1,2] # [3,4]>;
This was missing feature was highlighted by Nicolai
at FOSDEM talk.
Reviewed by: nhaehnle, hfinkel
Differential Revision: https://reviews.llvm.org/D58895
llvm-svn: 355414
This is a small addition to arithmetic operations that improves
expressiveness of the language.
Differential Revision: https://reviews.llvm.org/D58775
llvm-svn: 355187
Summary:
This adds support for defining patterns for global isel using pointer
types, for example:
def : Pat<(load GPR32:$src),
(p1 (LOAD GPR32:$src))>;
DAGISelEmitter will ignore the pointer information and treat these
types as integers with the same bit-width as the pointer type.
Reviewers: dsanders, rtereshin, arsenm
Reviewed By: arsenm
Subscribers: Petar.Avramovic, wdng, rovka, kristof.beyls, jfb, volkan, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57065
llvm-svn: 354510
If we run into a pattern that looks like this:
add
(complex $x, $y)
(complex $x, $z)
We should skip the pattern instead of asserting/doing something unpredictable.
This makes us return an Error in that case, and adds a testcase for skipped
patterns.
Differential Revision: https://reviews.llvm.org/D57980
llvm-svn: 353586
This patch extends TableGen language with !cond operator.
Instead of embedding !if inside !if which can get cumbersome,
one can now use !cond.
Below is an example to convert an integer 'x' into a string:
!cond(!lt(x,0) : "Negative",
!eq(x,0) : "Zero",
!eq(x,1) : "One,
1 : "MoreThanOne")
Reviewed By: hfinkel, simon_tatham, greened
Differential Revision: https://reviews.llvm.org/D55758
llvm-svn: 352185
Summary:
This fixes support in DAGISelMatcher backend for DAG nodes with multiple
result values. Previously the order of results in selected DAG nodes always
matched the order of results in ISel patterns. After the change the order of
results matches the order of operands in OutOperandList instead.
For example, given this definition from the attached test case:
def INSTR : Instruction {
let OutOperandList = (outs GPR:$r1, GPR:$r0);
let InOperandList = (ins GPR:$t0, GPR:$t1);
let Pattern = [(set i32:$r0, i32:$r1, (udivrem i32:$t0, i32:$t1))];
}
the DAGISelMatcher backend currently produces a matcher that creates INSTR
nodes with the first result `$r0` and the second result `$r1`, contrary to the
order in the OutOperandList. The order of operands in OutOperandList does not
matter at all, which is unexpected (and unfortunate) because the order of
results of a DAG node does matters, perhaps a lot.
With this change, if the order in OutOperandList does not match the order in
Pattern, DAGISelMatcherGen emits CompleteMatch opcodes with the order of
results taken from OutOperandList. Backend writers can use it to express
result reorderings in TableGen.
If the order in OutOperandList matches the order in Pattern, the result of
DAGISelMatcherGen is unaffected.
Patch by Eugene Sharygin
Reviewers: andreadb, bjope, hfinkel, RKSimon, craig.topper
Reviewed By: craig.topper
Subscribers: nhaehnle, craig.topper, llvm-commits
Differential Revision: https://reviews.llvm.org/D55055
llvm-svn: 348326
When tablegen detects that there exist two subregister compositions that
result in the same value for some register, it will emit a warning. This
kind of an overlap in compositions should only happen when it is caused
by a user-defined composition. It can happen, however, that the user-
defined composition is not identically equal to another one, but it does
produce the same value for one or more registers. In such cases suppress
the warning.
This patch is to silence the warning when building the System Z backend
after D50725.
Differential Revision: https://reviews.llvm.org/D50977
llvm-svn: 347894
There are quite strong constraints on how you can use the TIED_TO
constraint between MC operands, many of which are currently not
checked until compiler run time.
MachineVerifier enforces that operands can only be tied together in
pairs (no three-way ties), and MachineInstr::tieOperands enforces that
one of the tied operands must be an output operand (def) and the other
must be an input operand (use).
Now we check these at TableGen time, so that if you violate any of
them in a new instruction definition, you find out immediately,
instead of having to wait until you compile something that makes code
generation hit one of those assertions.
Also in this commit, all the error reports in ParseConstraint now
include the name and source location of the def where the problem
happened, so that if you do trigger any of these errors, it's easier
to find the part of your TableGen input where you made the mistake.
The trunk sources already build successfully with this additional
error check, so I think no in-tree target has any of these problems.
Reviewers: fhahn, lhames, nhaehnle, MatzeB
Reviewed By: MatzeB
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D53815
llvm-svn: 347743
Summary:
The issue with the python path is that the path to python on Windows can contain spaces. To make the tests always work, the path to python needs to be surrounded by quotes.
This change updates several configuration files which specify the path to python as a substitution and also remove quotes from existing tests.
Reviewers: asmith, zturner, alexshap, jakehehrlich
Reviewed By: zturner, alexshap, jakehehrlich
Subscribers: mehdi_amini, nemanjai, eraman, kbarton, jakehehrlich, steven_wu, dexonsmith, stella.stamenova, delcypher, llvm-commits
Differential Revision: https://reviews.llvm.org/D50206
llvm-svn: 339073
Summary: The path to the python executable can contain spaces, so it should be specified with quotes.
Reviewers: asmith, simon_tatham
Reviewed By: simon_tatham
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D49258
llvm-svn: 337006
The aim of this backend is to output everything TableGen knows about
the record set, similarly to the default -print-records backend. But
where -print-records produces output in TableGen's input syntax
(convenient for humans to read), this backend produces it as
structured JSON data, which is convenient for loading into standard
scripting languages such as Python, in order to extract information
from the data set in an automated way.
The output data contains a JSON representation of the variable
definitions in output 'def' records, and a few pieces of metadata such
as which of those definitions are tagged with the 'field' prefix and
which defs are derived from which classes. It doesn't dump out
absolutely every piece of knowledge it _could_ produce, such as type
information and complicated arithmetic operator nodes in abstract
superclasses; the main aim is to allow consumers of this JSON dump to
essentially act as new backends, and backends don't generally need to
depend on that kind of data.
The new backend is implemented as an EmitJSON() function similar to
all of llvm-tblgen's other EmitFoo functions, except that it lives in
lib/TableGen instead of utils/TableGen on the basis that I'm expecting
to add it to clang-tblgen too in a future patch.
To test it, I've written a Python script that loads the JSON output
and tests properties of it based on comments in the .td source - more
or less like FileCheck, except that the CHECK: lines have Python
expressions after them instead of textual pattern matches.
Reviewers: nhaehnle
Reviewed By: nhaehnle
Subscribers: arichardson, labath, mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D46054
llvm-svn: 336771
The vast number of added instructions for SVE causes TableGen to fail with an assertion:
Assertion `Delta < 65536U && "disassembler decoding table too large!"'
This patch increases the number of supported decoder fix-ups.
Reviewers: dmgreen, stoklund, petpav01
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D48937
llvm-svn: 336334
Implements PR34259
Intrinsics.h is a very popular header. Most LLVM TUs care about things
like dbg_value, but they don't care how they are implemented. After I
split these out, IntrinsicImpl.inc is 1.7 MB, so this saves each LLVM TU
from scanning 1.7 MB of source that gets pre-processed away.
It also means we can modify intrinsic properties without triggering a
full rebuild, but that's probably less of a win.
I think the next best thing to do would be to split out the target
intrinsics into their own header. Very, very few TUs care about
target-specific intrinsics. It's very hard to split up the target
independent intrinsics like llvm.expect, assume, and dbg.value, though.
llvm-svn: 335407
Summary:
This is essentially a rewrite of the backend which introduces TableGen
base classes GenericEnum, GenericTable, and SearchIndex. They allow
generating custom enums and tables with lookup functions using
separately defined records as the underlying database.
Also added as part of this change:
- Lookup functions may use indices composed of multiple fields.
- Instruction fields are supported similar to Intrinsic fields.
- When the lookup key has contiguous numeric values, the lookup
function will directly index into the table instead of using a binary
search.
The existing SearchableTable functionality is internally mapped to the
new primitives.
Change-Id: I444f3490fa1dbfb262d7286a1660a2c4308e9932
Reviewers: arsenm, tra, t.p.northover
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D48013
llvm-svn: 335225
Summary:
This also allows inner foreach loops to have a list that depends on
the iteration variable of an outer foreach loop. The test cases show
some very simple examples of how this can be used.
This was perhaps the last remaining major non-orthogonality in the
TableGen frontend.
Change-Id: I79b92d41a5c0e7c03cc8af4000c5e1bda5ef464d
Reviewers: tra, simon_tatham, craig.topper, MartinO, arsenm
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D47431
llvm-svn: 335221
So far, we've only handled special cases of PatFrag like ImmLeaf. This patch
adds support for the remaining cases using similar mechanisms.
Like most C++ code from SelectionDAG, GISel and DAGISel expect to operate on
different types and representations and as such the code is not compatible
between the two. It's therefore necessary to add an alternative implementation
in the GISelPredicateCode field.
The target test for this feature could easily be done with IntImmLeaf and this
would save on a little boilerplate. The reason I've chosen to implement this
using PatFrag.GISelPredicateCode and not IntImmLeaf is because I was unable to
find a rule that was blocked solely by lack of support for PatFrag predicates. I
found that the ones I investigated as being likely candidates for the test
were further blocked by other things.
llvm-svn: 334871
Summary:
The new rules are straightforward. The main rules to keep in mind
are:
1. NAME is an implicit template argument of class and multiclass,
and will be substituted by the name of the instantiating def/defm.
2. The name of a def/defm in a multiclass must contain a reference
to NAME. If such a reference is not present, it is automatically
prepended.
And for some additional subtleties, consider these:
3. defm with no name generates a unique name but has no special
behavior otherwise.
4. def with no name generates an anonymous record, whose name is
unique but undefined. In particular, the name won't contain a
reference to NAME.
Keeping rules 1&2 in mind should allow a predictable behavior of
name resolution that is simple to follow.
The old "rules" were rather surprising: sometimes (but not always),
NAME would correspond to the name of the toplevel defm. They were
also plain bonkers when you pushed them to their limits, as the old
version of the TableGen test case shows.
Having NAME correspond to the name of the toplevel defm introduces
"spooky action at a distance" and breaks composability:
refactoring the upper layers of a hierarchy of nested multiclass
instantiations can cause unexpected breakage by changing the value
of NAME at a lower level of the hierarchy. The new rules don't
suffer from this problem.
Some existing .td files have to be adjusted because they ended up
depending on the details of the old implementation.
Change-Id: I694095231565b30f563e6fd0417b41ee01a12589
Reviewers: tra, simon_tatham, craig.topper, MartinO, arsenm, javed.absar
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D47430
llvm-svn: 333900
This patch continues a series of patches started by r332907 (reapplied
as r332917).
In this commit we move register bank checks back from epilogue of
every rule matcher to a position locally close to the rest of the
checks for a particular (nested) instruction.
This increases the number of common conditions within 2nd level
groups.
This is expected to decrease time GlobalISel spends in its
InstructionSelect pass by about 2% for an -O0 build as measured on
sqlite3-amalgamation (http://sqlite.org/download.html) targeting
AArch64 (cross-compile on x86).
Reviewers: qcolombet, dsanders, bogner, aemerson, javed.absar
Reviewed By: qcolombet
Subscribers: rovka, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D44700
llvm-svn: 333144
This patch continues a series of patches started by r332907 (reapplied
as r332917).
In this commit we greedily stuff 2nd level GroupMatcher's common
conditions with as many predicates as possible. This is purely
post-processing and it doesn't change which rules are put into the
groups in the first place: that decision is made by looking at the
first common predicate only.
The compile time improvements are minor and well within error margin,
however, it's highly improbable that this transformation could
pessimize performance, thus I'm still committing it for potential
gains for targets not implementing GlobalISel yet and out of tree
targets.
Reviewers: qcolombet, dsanders, bogner, aemerson, javed.absar
Reviewed By: qcolombet
Subscribers: rovka, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D44700
llvm-svn: 333139
This patch continues a series of patches started by r332907 (reapplied
as r332917)
In this commit we sort type checks towards the beginning of every rule
within the MatchTable as they fail often and it's best to fail early.
This is expected to decrease time GlobalISel spends in its
InstructionSelect pass by roughly 7% for an -O0 build as measured on
sqlite3-amalgamation (http://sqlite.org/download.html) targeting
AArch64. The amalgamation is a large single-file C-source that makes
compiler backend performance improvements to stand out from frontend.
It's also a part of CTMark.
Reviewers: qcolombet, dsanders, bogner, aemerson, javed.absar
Reviewed By: qcolombet
Subscribers: rovka, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D44700
llvm-svn: 333114
This patch continues a series of patches started by r332907 (reapplied
as r332917)
In this commit we start grouping rules with common first condition on
the second level of the table.
This is expected to decrease time GlobalISel spends in its
InstructionSelect pass by roughly 13% for an -O0 build as measured on
sqlite3-amalgamation (http://sqlite.org/download.html) targeting
AArch64.
Reviewers: qcolombet, dsanders, bogner, aemerson, javed.absar
Reviewed By: qcolombet
Subscribers: rovka, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D44700
llvm-svn: 333053
This patch continues a series of patches started by r332907 (reapplied
as r332917)
In this commit we introduce a new matching opcode GIM_SwitchOpcode
that implements a jump table over opcodes and start emitting them for
root instructions.
This is expected to decrease time GlobalISel spends in its
InstructionSelect pass by roughly 20% for an -O0 build as measured on
sqlite3-amalgamation (http://sqlite.org/download.html) targeting
AArch64.
To some degree, we assume here that the opcodes form a dense set,
which is true at the moment for all upstream targets given the
limitations of our rule importing mechanism.
It might not be true for out of tree targets, specifically due to
pseudo's. If so, we might noticeably increase the size of the
MatchTable with this patch due to padding zeros. This will be
addressed later.
Reviewers: qcolombet, dsanders, bogner, aemerson, javed.absar
Reviewed By: qcolombet
Subscribers: rovka, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D44700
llvm-svn: 333017
Some ISA's such as microMIPS32(R6) have instructions which are near identical
for code generation purposes, e.g. xor and xor16. These instructions take the
same value types for operands and return values, have the same
instruction predicates and map to the same ISD opcode. (These instructions do
differ by register classes.)
In such cases, the FastISel generator rejects the instruction definition.
This patch borrows the 'FastIselShouldIgnore' bit from rL129692 and enables
applying it to an instruction definition.
Reviewers: mcrosier
Differential Revision: https://reviews.llvm.org/D46953
llvm-svn: 332983
This patch continues a series of patches that decrease time spent by
GlobalISel in its InstructionSelect pass by roughly 60% for -O0 builds
for large inputs as measured on sqlite3-amalgamation
(http://sqlite.org/download.html) targeting AArch64.
This commit specifically removes number of operands checks that are
redundant if the instruction's opcode already guarantees that number
of operands (or more), and also avoids any kind of checks on a def
operand of a nested instruction as everything about it was already
checked at its use.
The expected performance implication is about 3% off InstructionSelect
comparing to the baseline (before the series of patches)
This patch also contains a bit of NFC changes required for further
patches in the series.
Every commit planned shares the same Phabricator Review.
Reviewers: qcolombet, dsanders, bogner, aemerson, javed.absar
Reviewed By: qcolombet
Subscribers: rovka, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D44700
llvm-svn: 332945
Apparently the compile time problem was caused by the fact that not
all compilers / STL implementations can automatically convert
std::unique_ptr<Derived> to std::unique_ptr<Base>. Fixed (hopefully)
by making sure it's std::unique_ptr<Derived>&& (rvalue ref) to
std::unique_ptr<Base> conversion instead.
llvm-svn: 332917
This patch starts a series of patches that decrease time spent by
GlobalISel in its InstructionSelect pass by roughly 60% for -O0 builds
for large inputs as measured on sqlite3-amalgamation
(http://sqlite.org/download.html) targeting AArch64.
The performance improvements are achieved solely by reducing the
number of matching GIM_* opcodes executed by the MatchTable's
interpreter during the selection by approx. a factor of 30, which also
brings contribution of this particular part of the selection process
to the overall runtime of InstructionSelect pass down from approx.
60-70% to 5-7%, thus making further improvements in this particular
direction not very profitable.
The improvements described above are expected for any target that
doesn't have many complex patterns. The targets that do should
strictly benefit from the changes, but by how much exactly is hard to
estimate beforehand. It's also likely that such target WILL benefit
from further improvements to MatchTable, most likely the ones that
bring it closer to a perfect decision tree.
This commit specifically is rather large mostly NFC commit that does
necessary preparation work and refactoring, there will be a following
series of small patches introducing a specific optimization each
shortly after.
This commit specifically is expected to cause a small compile time
regression (around 2.5% of InstructionSelect pass time), which should
be fixed by the next commit of the series.
Every commit planned shares the same Phabricator Review.
Reviewers: qcolombet, dsanders, bogner, aemerson, javed.absar
Reviewed By: qcolombet
Subscribers: rovka, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D44700
llvm-svn: 332907
Summary:
Previously, a extending load was represented at (G_*EXT (G_LOAD x)).
This had a few drawbacks:
* G_LOAD had to be legal for all sizes you could extend from, even if
registers didn't naturally hold those sizes.
* All sizes you could extend from had to be allocatable just in case the
extend went missing (e.g. by optimization).
* At minimum, G_*EXT and G_TRUNC had to be legal for these sizes. As we
improve optimization of extends and truncates, this legality requirement
would spread without considerable care w.r.t when certain combines were
permitted.
* The SelectionDAG importer required some ugly and fragile pattern
rewriting to translate patterns into this style.
This patch changes the representation to:
* (G_[SZ]EXTLOAD x)
* (G_LOAD x) any-extends when MMO.getSize() * 8 < ResultTy.getSizeInBits()
which resolves these issues by allowing targets to work entirely in their
native register sizes, and by having a more direct translation from
SelectionDAG patterns.
Each extending load can be lowered by the legalizer into separate extends
and loads, however a target that supports s1 will need the any-extending
load to extend to at least s8 since LLVM does not represent memory accesses
smaller than 8 bit. The legalizer can widenScalar G_LOAD into an
any-extending load but sign/zero-extending loads need help from something
else like a combiner pass. A follow-up patch that adds combiner helpers for
for this will follow.
The new representation requires that the MMO correctly reflect the memory
access so this has been corrected in a couple tests. I've also moved the
extending loads to their own tests since they are (mostly) separate opcodes
now. Additionally, the re-write appears to have invalidated two tests from
select-with-no-legality-check.mir since the matcher table no longer contains
loads that result in s1's and they aren't legal in AArch64 anymore.
Depends on D45540
Reviewers: ab, aditya_nandakumar, bogner, rtereshin, volkan, rovka, javed.absar
Reviewed By: rtereshin
Subscribers: javed.absar, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D45541
llvm-svn: 331601
The main goal is to share getMatchTable between the Instruction
Selector and the Testgen.
The commit also contains some NFC only loosely related to refactoring
out the getMatchTable, but strongly related to the initial Testgen
patch (see https://reviews.llvm.org/D43962)
Reviewers: dsanders, aemerson
Reviewed By: dsanders
Subscribers: rovka, kristof.beyls, llvm-commits, dsanders
Differential Revision: https://reviews.llvm.org/D46096
llvm-svn: 331395
An input !foreach expression such as !foreach(a, lst, !add(a, 1))
would be re-emitted by llvm-tblgen -print-records with the first
argument in quotes, giving !foreach("a", lst, !add(a, 1)), which isn't
valid TableGen input syntax.
Reviewers: nhaehnle
Reviewed By: nhaehnle
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D46352
llvm-svn: 331351
Summary:
We will use this in the AMDGPU backend in a subsequent patch
in the stack to lookup target-specific per-intrinsic information.
The generic CodeGenIntrinsic machinery is used to ensure that,
even though we don't calculate actual enum values here, we do
get the intrinsics in the right order for the binary search
index.
Change-Id: If61cd5587963a4c5a1cc53df1e59c5e4dec1f9dc
Reviewers: arsenm, rampitec, b-sumner
Subscribers: wdng, tpr, llvm-commits
Differential Revision: https://reviews.llvm.org/D44935
llvm-svn: 328937
Summary:
Instantiating def's and defm's needs to perform the following steps:
- for defm's, clone multiclass def prototypes and subsitute template args
- for def's and defm's, add subclass definitions, substituting template
args
- clone the record based on foreach loops and substitute loop iteration
variables
- override record variables based on the global 'let' stack
- resolve the record name (this should be simple, but unfortunately it's
not due to existing .td files relying on rather silly implementation
details)
- for def(m)s in multiclasses, add the unresolved record as a multiclass
prototype
- for top-level def(m)s, resolve all internal variable references and add
them to the record keeper and any active defsets
This change streamlines how we go through these steps, by having both
def's and defm's feed into a single addDef() method that handles foreach,
final resolve, and routing the record to the right place.
This happens to make foreach inside of multiclasses work, as the new
test case demonstrates. Previously, foreach inside multiclasses was not
forbidden by the parser, but it was de facto broken.
Another side effect is that the order of "instantiated from" notes in error
messages is reversed, as the modified test case shows. This is arguably
clearer, since the initial error message ends up pointing directly to
whatever triggered the error, and subsequent notes will point to increasingly
outer layers of multiclasses. This is consistent with how C++ compilers
report nested #includes and nested template instantiations.
Change-Id: Ica146d0db2bc133dd7ed88054371becf24320447
Reviewers: arsenm, craig.topper, tra, MartinO
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D44478
llvm-svn: 328117
Summary:
Otherwise, patterns like in the test case produce cryptic error
messages about fields being resolved incompletely.
Change-Id: I713c0191f00fe140ad698675803ab1f8823dc5bd
Reviewers: arsenm, craig.topper, tra, MartinO
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D44476
llvm-svn: 327850
Summary:
The docs already claim that this happens, but so far it hasn't. As a
consequence, existing TableGen files get this wrong a lot, but luckily
the fixes are all reasonably straightforward.
To make this work with all the existing forms of self-references (since
the true type of a record is only built up over time), the lookup of
self-references in !cast is delayed until the final resolving step.
Change-Id: If5923a72a252ba2fbc81a889d59775df0ef31164
Reviewers: arsenm, craig.topper, tra, MartinO
Subscribers: wdng, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D44475
llvm-svn: 327849
Summary:
These are cases of self-references that exist today in practice. Let's
add tests for them to avoid regressions.
The self-references in PPCInstrInfo.td can be expressed in a simpler
way. Allowing this type of self-reference while at the same time
consistently doing late-resolve even for self-references is problematic
because there are references to fields that aren't in any class. Since
there's no need for this type of self-reference anyway, let's just
remove it.
Change-Id: I914e0b3e1ae7adae33855fac409b536879bc3f62
Reviewers: arsenm, craig.topper, tra, MartinO
Subscribers: nemanjai, wdng, kbarton, llvm-commits
Differential Revision: https://reviews.llvm.org/D44474
llvm-svn: 327848
Summary:
Cast-from-string for records isn't going away, but cast-from-string for
variables is a pretty dodgy feature to have, especially when referencing
template arguments. It's doubtful that this ever worked in a reliable
way, and nobody seems to be using it, so let's get rid of it and get
some related cleanups.
Change-Id: I395ac8a43fef4cf98e611f2f552300d21e99b66a
Reviewers: arsenm, craig.topper, tra, MartinO
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D44195
llvm-svn: 327844
Additionally, allow more than two operands to !con, !add, !and, !or
in the same way as is already allowed for !listconcat and !strconcat.
Change-Id: I9659411f554201b90cd8ed7c7e004d381a66fa93
Differential revision: https://reviews.llvm.org/D44112
llvm-svn: 327494
This makes using !dag more convenient in some cases.
Change-Id: I0a8c35e15ccd1ecec778fd1c8d64eee38d74517c
Differential revision: https://reviews.llvm.org/D44111
llvm-svn: 327493
This allows constructing DAG nodes with programmatically determined
names, and can simplify constructing DAG nodes in other cases as
well.
Also, add documentation and some very simple tests for the already
existing !con.
Change-Id: Ida61cd82e99752548d7109ce8da34d29da56a5f7
Differential revision: https://reviews.llvm.org/D44110
llvm-svn: 327492
Allows capturing a list of concrete instantiated defs.
This can be combined with foreach to create parallel sets of def
instantiations with less repetition in the source. This purpose is
largely also served by multiclasses, but in some cases multiclasses
can't be used.
The motivating example for this change is having a large set of
intrinsics, which are generated from the IntrinsicsBackend.td file
included by Intrinsics.td, and a corresponding set of instruction
selection patterns, which are generated via the backend's .td files.
Multiclasses cannot be used to eliminate the redundancy in this case,
because a multiclass cannot span both LLVM's common .td files and
the backend .td files at the same time.
Change-Id: I879e35042dceea542a5e6776fad23c5e0e69e76b
Differential revision: https://reviews.llvm.org/D44109
llvm-svn: 327121
The changes to FieldInit are required to make field references (Def.field)
work inside a ForeachDeclaration: previously, Def.field wasn't resolved
immediately when Def was already a fully resolved DefInit.
Change-Id: I9875baec2fc5aac8c2b249e45b9cf18c65ae699b
llvm-svn: 327120
Summary:
Only instantiate anonymous records once all variable references in template
arguments have been resolved. This allows patterns like the new test case,
which in practice can appear in expressions like:
class IntrinsicTypeProfile<list<LLVMType> ty, int shift> {
list<LLVMType> types =
!listconcat(ty, [llvm_any_ty, LLVMMatchType<shift>]);
}
class FooIntrinsic<IntrinsicTypeProfile P, ...>
: Intrinsic<..., P.types, ...>;
Without this change, the anonymous LLVMMatchType instantiation would
never get resolved.
Another consequence of this change is that anonymous inline
instantiations are uniqued via the folding set of the newly introduced
VarDefInit.
Change-Id: I7a7041a20e297cf98c9109b28d85e64e176c932a
Reviewers: arsenm, craig.topper, tra, MartinO
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D43756
llvm-svn: 326788
Summary:
There are various places where resolving and constant folds can
get stuck, especially around casts. We don't always signal an
error for those, because in many cases they can legitimately
occur without being an error in the "untaken branch" of an !if.
Change-Id: I3befc0e4234c8e6cc61190504702918c9f29ce5c
Reviewers: arsenm, craig.topper, tra, MartinO
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D43754
llvm-svn: 326786
Summary:
Distinguish two relationships between types: is-a and convertible-to.
For example, a bit is not an int or vice versa, but they can be
converted into each other (with range checks that you can think of
as "dynamic": unlike other type checks, those range checks do not
happen during parsing, but only once the final values have been
established).
Actually converting initializers between types is subtle: even
when values of type A can be converted to type B (e.g. int into
string), it may not be possible to do so with a concrete initializer
(e.g., a VarInit that refers to a variable of type int cannot
be immediately converted to a string).
For this reason, distinguish between getCastTo and convertInitializerTo:
the latter implements the actual conversion when appropriate, while
the former will first try to do the actual conversion and fall back
to introducing a !cast operation so that the conversion will be
delayed until variable references have been resolved.
To make the approach of adding !cast operations to work, !cast needs
to fallback to convertInitializerTo when the special string <-> record
logic does not apply.
This enables casting records to a subclass, although that new
functionality is only truly useful together with !isa, which will be
added in a later change.
The test is removed because it uses !srl on a bit sequence,
which cannot really be supported consistently, but luckily
isn't used anywhere either.
Change-Id: I98168bf52649176654ed2ec61a29bdb29970cfe7
Reviewers: arsenm, craig.topper, tra, MartinO
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D43753
llvm-svn: 326785
Summary:
No functional change intended. The removed code has a loop for
recursive resolving, which is superseded by the recursive
resolving done by the Resolver implementations.
Add a test case which was broken by an earlier version of this
change.
Change-Id: Ib208d037b77a8bbb725977f1388601fc984723d8
Reviewers: arsenm, craig.topper, tra, MartinO
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D43655
llvm-svn: 326784
Summary:
Allow RecordRecTy to represent the type "subclass of N superclasses",
where N may be zero. Furthermore, generate RecordRecTy instances only
with actual classes in the list.
Keeping track of multiple superclasses is required to resolve the type
of a list correctly in some cases. The old code relied on the incorrect
behavior of typeIsConvertibleTo, and an earlier version of this change
relied on a modified ordering of superclasses (it was committed in
r325884 and then reverted because unfortunately some of clang-tblgen's
backends depend on the ordering).
Previously, the DefInit for each Record would have a RecordRecTy of
that Record as its type. Now, all defs with the same superclasses will
share the same type.
This allows us to be more consistent about type checks involving records:
- typeIsConvertibleTo actually requires the LHS to be a subtype of the
RHS
- resolveTypes will return the least supertype of given record types in
all cases
- different record types in the two branches of an !if are handled
correctly
Add a test that used to be accepted without flagging the obvious type
error.
Change-Id: Ib366db1a4e6a079f1a0851e469b402cddae76714
Reviewers: arsenm, craig.topper, tra, MartinO
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D43680
llvm-svn: 326783
Summary:
Use the new resolver interface more explicitly, and avoid traversing
all the initializers multiple times.
Add a test case for a pattern that was broken by an earlier version
of this change.
An additional change is that we now remove *all* template arguments
after resolving them.
Change-Id: I86c828c8cc84c18b052dfe0f64c0d5cbf3c4e13c
Reviewers: arsenm, craig.topper, tra, MartinO
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D43652
llvm-svn: 326706
Summary:
This changes the syntax of !foreach so that the first "parameter" is
a new syntactic variable: !foreach(x, lst, expr) will define the
variable x within the scope of expr, and evaluation of the !foreach
will substitute elements of the given list (or dag) for x in expr.
Aside from leading to a nicer syntax, this allows more complex
expressions where x is deeply nested, or even constant expressions
in which x does not occur at all.
!foreach is currently not actually used anywhere in trunk, but I
plan to use it in the AMDGPU backend. If out-of-tree targets are
using it, they can adjust to the new syntax very easily.
Change-Id: Ib966694d8ab6542279d6bc358b6f4d767945a805
Reviewers: arsenm, craig.topper, tra, MartinO
Subscribers: wdng, llvm-commits, tpr
Differential Revision: https://reviews.llvm.org/D43651
llvm-svn: 326705
Summary:
NAME has already worked for def in a multiclass, since the (protoype)
record including its NAME variable is created before parsing the
superclasses. Since defm's do not have an associated single record,
support for NAME has to be implemented differently here.
Original test cases provided by Artem Belevich (tra)
Change-Id: I933b74f328c0ff202e7dc23a35b78f3505760cc9
Reviewers: arsenm, craig.topper, tra, MartinO
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D43656
llvm-svn: 326700
This reverts r325884.
Clang's TableGen has dependencies on the exact ordering of superclasses.
Revert this change fully for now to fix the build.
Change-Id: Ib297f5571cc7809f00838702ad7ab53d47335b26
llvm-svn: 325891
Summary:
Only check whether the left-hand side type is a subclass (or equal to)
the right-hand side type.
This requires a further fix in handling !if expressions and in type
resolution.
Furthermore, reverse the order of superclasses so that resolveTypes will
find a least common ancestor at least in simple cases.
Add a test that used to be accepted without flagging the obvious type
error.
Change-Id: Ib366db1a4e6a079f1a0851e469b402cddae76714
Reviewers: arsenm, craig.topper, tra, MartinO
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D43559
llvm-svn: 325884