As part of the unification of the debug format and the MIR format,
always print registers as lowercase.
* Only debug printing is affected. It now follows MIR.
Differential Revision: https://reviews.llvm.org/D40417
llvm-svn: 319187
This is needed for cases when the memory access is not as big as the width of
the data type. For instance, storing i1 (1 bit) would be done in a byte (8
bits).
Using 'BitSize >> 3' (or '/ 8') would e.g. give the memory access of an i1 a
size of 0, which for instance makes alias analysis return NoAlias even when
it shouldn't.
There are no tests as this was done as a follow-up to the bugfix for the case
where this was discovered (r318824). This handles more similar cases.
Review: Björn Petterson
https://reviews.llvm.org/D40339
llvm-svn: 319173
LLVM Coding Standards:
Function names should be verb phrases (as they represent actions), and
command-like function should be imperative. The name should be camel
case, and start with a lower case letter (e.g. openFile() or isFoo()).
Differential Revision: https://reviews.llvm.org/D40416
llvm-svn: 319168
Change LowerBUILD_VECTOR to use those functions. This commit will tempora-
rily affect constant vector generation (it will generate constant-extended
values instead of non-extended combines), but the code for the general case
should be better. The constant selection part will be fixed later.
llvm-svn: 318877
All these headers already depend on CodeGen headers so moving them into
CodeGen fixes the layering (since CodeGen depends on Target, not the
other way around).
llvm-svn: 318490
Summary:
Make it possible to feed runtime information back to tablegen to enable
profile-guided tablegen-eration, detection of untested tablegen definitions, etc.
Being a cross-compiler by nature, LLVM will potentially collect data for multiple
architectures (e.g. when running 'ninja check'). We therefore need a way for
TableGen to figure out what data applies to the backend it is generating at the
time. This patch achieves that by including the name of the 'def X : Target ...'
for the backend in the TargetRegistry.
Reviewers: qcolombet
Reviewed By: qcolombet
Subscribers: jholewinski, arsenm, jyknight, aditya_nandakumar, sdardis, nemanjai, ab, nhaehnle, t.p.northover, javed.absar, qcolombet, llvm-commits, fedor.sergeev
Differential Revision: https://reviews.llvm.org/D39742
llvm-svn: 318352
Previously, hasSideEffects was ? for TargetOpcode::PHI and would be inferred
as 1. D37065 sets the previously inferred properties explicitly. This patch sets
hasSideEffects=0 for PHI, as it is for G_PHI. MachineInstr::isSafeToMove has
been updated so it still returns false for PHI.
Additionally, HexagonBitSimplify relied on a PHI node having the
hasUnmodeledSideEffects property. This patch fixes that assumption.
Differential Revision: https://reviews.llvm.org/D37097
llvm-svn: 317721
This header includes CodeGen headers, and is not, itself, included by
any Target headers, so move it into CodeGen to match the layering of its
implementation.
llvm-svn: 317647
An "or" that sets the sign-bit can be replaced with a "xor", if
the sign-bit was known to be clear before. With some changes to
instruction combining, the simple sign-bit check was failing.
Replace it with a more flexible one to catch more cases.
llvm-svn: 317592
This header already includes a CodeGen header and is implemented in
lib/CodeGen, so move the header there to match.
This fixes a link error with modular codegeneration builds - where a
header and its implementation are circularly dependent and so need to be
in the same library, not split between two like this.
llvm-svn: 317379
In getOffsetRange, Max can be set to 0 to force the extender replacement
to be at or below the original value. This would cause the new offset to
be non-negative, which is preferred for memory instructions (to reduce
the likelihood of it getting constant-extended due to predication). The
problem happens when the range is shifted by an offset (present in the
instruction being examined) and the offset is negative. The entire range
for the allowable deviation will then be strictly negative. This creates
a problem, since 0 is assumed to be a valid deviation.
llvm-svn: 316601
In HexagonISelLowering, there is code to handle the case when
a function returns an i1 type. In this case, we need to generate
extra nodes to copy the result from R0 to a predicate register.
The code was returning the wrong value for the chain edge which
caused an assert "Wrong topological sorting" when converting the
instructions to MIs.
This patch fixes the problem by returning the chain for the final
copy.
Patch by Brendon Cahoon.
llvm-svn: 316367
Normally, if the registers holding the induction variable's bounds
are redefined inside of the loop's body, the loop cannot be converted
to a hardware loop. However, if the redefining instruction is actually
loading an immediate value into the register, this conversion is both
possible and legal (since the immediate itself will be used in the
loop setup in the preheader).
llvm-svn: 316218
This patch lets the llvm tools handle the new HVX target features that
are added by frontend (clang). The target-features are of the form
"hvx-length64b" for 64 Byte HVX mode, "hvx-length128b" for 128 Byte mode HVX.
"hvx-double" is an alias to "hvx-length128b" and is soon will be deprecated.
The hvx version target feature is upgated form "+hvx" to "+hvxv{version_number}.
Eg: "+hvxv62"
For the correct HVX code generation, the user must use the following
target features.
For 64B mode: "+hvxv62" "+hvx-length64b"
For 128B mode: "+hvxv62" "+hvx-length128b"
Clang picks a default length if none is specified. If for some reason,
no hvx-length is specified to llvm, the compilation will bail out.
There is a corresponding clang patch.
Differential Revision: https://reviews.llvm.org/D38851
llvm-svn: 316101
All loads of form V6_vL32b_{,cur,nt,tmp,nt_cur,nt_tmp}_{ai,pi,ppu} are
predicable on v62 (but not on v60). Mark them all as predicable in the
instruction definitions, and handle the v60 case in HII::isPredicable.
llvm-svn: 316098
Each constant extender requires an extra instruction, which adds to the
code size and also reduces the number of available slots in an instruction
packet. In most cases, the value of a repeated constant extender could be
loaded into a register, and the instructions using the extender could be
replaced with their counterparts that use that register instead.
This patch adds a pass that tries to reduce the number of constant
extenders, including extenders which differ only in an immediate offset
known at compile time, e.g. @global and @global+12.
llvm-svn: 315735
Reverting to investigate layering effects of MCJIT not linking
libCodeGen but using TargetMachine::getNameWithPrefix() breaking the
lldb bots.
This reverts commit r315633.
llvm-svn: 315637
Merge LLVMTargetMachine into TargetMachine.
- There is no in-tree target anymore that just implements TargetMachine
but not LLVMTargetMachine.
- It should still be possible to stub out all the various functions in
case a target does not want to use lib/CodeGen
- This simplifies the code and avoids methods ending up in the wrong
interface.
Differential Revision: https://reviews.llvm.org/D38489
llvm-svn: 315633
Summary:
Add LLVM_FORCE_ENABLE_DUMP cmake option, and use it along with
LLVM_ENABLE_ASSERTIONS to set LLVM_ENABLE_DUMP.
Remove NDEBUG and only use LLVM_ENABLE_DUMP to enable dump methods.
Move definition of LLVM_ENABLE_DUMP from config.h to llvm-config.h so
it'll be picked up by public headers.
Differential Revision: https://reviews.llvm.org/D38406
llvm-svn: 315590
MCObjectStreamer owns its MCCodeEmitter -- this fixes the types to reflect that,
and allows us to remove the last instance of MCObjectStreamer's weird "holding
ownership via someone else's reference" trick.
llvm-svn: 315531
The software pipeliner and the packetizer try to break dependence
between the post-increment instruction and the dependent memory
instructions by changing the base register and the offset value.
However, in some cases, the existing logic didn't work properly
and created incorrect offset value.
Patch by Jyotsna Verma.
llvm-svn: 315468
The pipeliner is generating a serial sequence that causes poor
register allocation when a post-increment instruction appears
prior to the use of the post-increment register. This occurs when
there is a circular set of dependences involved with a sequence
of instructions in the same cycle. In this case, there is no
serialization of the parallel semantics that will not cause an
additional register to be allocated.
This patch fixes the problem by changing the instructions so that
the post-increment instruction is used by the subsequent
instruction, which enables the register allocator to make a
better decision and not require another register.
Patch by Brendon Cahoon.
llvm-svn: 315466
This adds debug tracing to the table-generated assembly instruction matcher,
enabled by the -debug-only=asm-matcher option.
The changes in the target AsmParsers are to add an MCInstrInfo reference under
a consistent name, so that we can use it from table-generated code. This was
already being used this way for targets that use deprecation warnings, but 5
targets did not have it, and Hexagon had it under a different name to the other
backends.
llvm-svn: 315445
MCObjectStreamer owns its MCAsmBackend -- this fixes the types to reflect that,
and allows us to remove another instance of MCObjectStreamer's weird "holding
ownership via someone else's reference" trick.
llvm-svn: 315410
functions.
This makes the ownership of the resulting MCObjectWriter clear, and allows us
to remove one instance of MCObjectStreamer's bizarre "holding ownership via
someone else's reference" trick.
llvm-svn: 315327
ELFObjectWriter's constructor.
Fixes the same ownership issue for ELF that r315245 did for MachO:
ELFObjectWriter takes ownership of its MCELFObjectTargetWriter, so we want to
pass this through to the constructor via a unique_ptr, rather than a raw ptr.
llvm-svn: 315254
The new format is changeAddrMode_xx_yy, where xx is the current mode,
and yy is the new one.
Old name: New name:
getBaseWithImmOffset changeAddrMode_abs_io
getAbsoluteForm changeAddrMode_io_abs
getBaseWithRegOffset changeAddrMode_io_rr
xformRegToImmOffset changeAddrMode_rr_io
getBaseWithLongOffset changeAddrMode_rr_ur
getRegShlForm changeAddrMode_ur_rr
llvm-svn: 315013
The old algoritm was not correct, although it worked most of the time.
Avoid the complex reachability analysis and simply calculate the maximal
registers out of the set of all referenced registers.
llvm-svn: 314991
If the two instructions being compared for equivalence have corresponding operands
that are integer constants, then check their values to determine equivalence.
Patch by Suyog Sarda!
llvm-svn: 314642
This patch produces a crash and hexagon_vector_loop_carried_reuse_constant.ll test fails on Windows (llvm-clang-x86_64-expensive-checks-win build bot).
llvm-svn: 314361
Add two callbacks to MachineEvaluator, so that specific implementations
can specify more details about register classes:
- composeWithSubRegIndex(RC,Idx), to provide the register class for a
register from RC used in conjunction with a subregister index Idx.
- getPhysRegBitWidth(Reg), to provide the size in bits of the given
physical register.
llvm-svn: 314136
If the two instructions being compared for equivalence have corresponding operands
that are integer constants, then check their values to determine equivalence.
Patch by Suyog Sarda!
llvm-svn: 313993
This patch adds a pass that removes the computation of provably redundant
expressions that have been computed earlier in a previous iteration. It
relies on the use of PHIs to identify loop carried dependences.
This is scalar replacement for vector types.
llvm-svn: 313925
This removes the duplicate HVX instruction set for the 128-byte mode.
Single instruction set now works for both modes (64- and 128-byte).
llvm-svn: 313362
It used to return the actual field value from the instruction descriptor.
There is no reason for that, that value is not interesting in any way and
the specifics of its encoding in the descriptor should not be exposed.
llvm-svn: 313257
The check (assuming positive stride) for validity of memmove should be
(a) the destination is at a lower address than the source, or
(b) the distance between the source and destination is greater than or
equal the number of bytes copied.
For the second part it is sufficient to assume that the destination
is at a higher address, since the opposite case is covered by (a).
The distance calculation was previously done by subtracting the
pointers in the wrong order.
llvm-svn: 311650
The liveness-tracking code assumes that the registers that were saved
in the function's prolog are live outside of the function. Specifically,
that registers that were saved are also live-on-exit from the function.
This isn't always the case as illustrated by the LR register on ARM.
Differential Revision: https://reviews.llvm.org/D36160
llvm-svn: 310619
IMHO it is an antipattern to have a enum value that is Default.
At any given piece of code it is not clear if we have to handle
Default or if has already been mapped to a concrete value. In this
case in particular, only the target can do the mapping and it is nice
to make sure it is always done.
This deletes the two default enum values of CodeModel and uses an
explicit Optional<CodeModel> when it is possible that it is
unspecified.
llvm-svn: 309911
Certain operations require vector of i1 values. However, for Hexagon
architecture compatibility, they need to be represented as vector of i8.
Patch by Suyog Sarda.
llvm-svn: 309677
Summary:
Since r293359, most dump() function are only defined when
`!defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)` holds. print() functions
only used by dump() functions are now unused in release builds,
generating lots of warnings. This patch only defines some print()
functions if they are used.
Reviewers: MatzeB
Reviewed By: MatzeB
Subscribers: arsenm, mzolotukhin, nhaehnle, llvm-commits
Differential Revision: https://reviews.llvm.org/D35949
llvm-svn: 309553
Summary:
This silences a couple of implicit fallthrough warnings with GCC 7.1 in
this file.
Reviewers: colinl, kparzysz
Reviewed By: kparzysz
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D35889
llvm-svn: 309129
Changing mask argument type from const SmallVectorImpl<int>& to
ArrayRef<int>.
This came up in D35700 where a mask is received as an ArrayRef<int> and
we want to pass it to TargetLowering::isShuffleMaskLegal().
Also saves a few lines of code.
llvm-svn: 309085
This patch makes LSR generate better code for SystemZ in the cases of memory
intrinsics, Load->Store pairs or comparison of immediate with memory.
In order to achieve this, the following common code changes were made:
* New TTI hook: LSRWithInstrQueries(), which defaults to false. Controls if
LSR should do instruction-based addressing evaluations by calling
isLegalAddressingMode() with the Instruction pointers.
* In LoopStrengthReduce: handle address operands of memset, memmove and memcpy
as address uses, and call isFoldableMemAccessOffset() for any LSRUse::Address,
not just loads or stores.
SystemZ changes:
* isLSRCostLess() implemented with Insns first, and without ImmCost.
* New function supportedAddressingMode() that is a helper for TTI methods
looking at Instructions passed via pointers.
Review: Ulrich Weigand, Quentin Colombet
https://reviews.llvm.org/D35262https://reviews.llvm.org/D35049
llvm-svn: 308729
The flag "-hexagon-emit-lut-text" (defaulted to false) is added to decide
on where to keep the switch generated lookup table.
Differential Revision: https://reviews.llvm.org/D34818
llvm-svn: 308316
The target-independent lowering works fine, except concatenating 32-bit
words. Add a pattern to generate A2_combinew instead of 64-bit asl/or.
llvm-svn: 308186
This is the LLVM part, adding definitions for
void @llvm.hexagon.Y2.dccleana(i8*)
void @llvm.hexagon.Y2.dccleaninva(i8*)
void @llvm.hexagon.Y2.dcinva(i8*)
void @llvm.hexagon.Y2.dczeroa(i8*)
void @llvm.hexagon.Y4.l2fetch(i8*, i32)
void @llvm.hexagon.Y5.l2fetch(i8*, i64)
The clang part will follow.
llvm-svn: 308032
The issue is not if the value is pcrel. It is whether we have a
relocation or not.
If we have a relocation, the static linker will select the upper
bits. If we don't have a relocation, we have to do it.
llvm-svn: 307730
The llvm flag "-hexagon-emit-lookup-tables" guards the generation
of lookup table generated from a switch statement.
Differential Revision: https://reviews.llvm.org/D34819
llvm-svn: 306877
This patch adds a new LLVM flag -hexagon-emit-jt-text which is defaulted to
"false". The value "true" emits the switch generated jump tables in text section.
Differential Revision: https://reviews.llvm.org/D34820
llvm-svn: 306872
The llvm flag "-hexagon-emit-lookup-tables" guards the generation
of lookup table from a switch statement.
Differential Revision: https://reviews.llvm.org/D34819
llvm-svn: 306869
The changes are a result of discussion of https://reviews.llvm.org/D33685.
It solves the following problem:
1. We can inform getGEPCost about simplified indices to help it with
calculating the cost. But getGEPCost does not take into account the
context which GEPs are used in.
2. We have getUserCost which can take the context into account but we cannot
inform about simplified indices.
With the changes getUserCost will have access to additional information
as getGEPCost has.
The one parameter getUserCost is also provided.
Differential Revision: https://reviews.llvm.org/D34057
llvm-svn: 306674
processFixupValue is called on every relaxation iteration. applyFixup
is only called once at the very end. applyFixup is then the correct
place to do last minute changes and value checks.
While here, do proper range checks again for fixup_arm_thumb_bl. We
used to do it, but dropped because of thumb2. We now do it again, but
use the thumb2 range.
llvm-svn: 306177
It causes an extra pass of the machine verifier to be added to the pass
manager, and causes test/CodeGen/Generic/llc-start-stop.ll to fail.
llvm-svn: 306140
The feeder instruction will be moved to right before the compare, so
the updating code should not be looking for kills past the compare.
llvm-svn: 306059
The second part of r305300: when placing the mux at the later location,
make sure that it won't use any register that was killed between the
two original instructions. Remove any such kills and transfer them to
the mux.
llvm-svn: 305553
Store-immediate instructions have a non-extendable offset. Since the
actual offset for a stack object is not known until much later, only
generate these stores when the stack size (at the time of instruction
selection) is small.
llvm-svn: 305305
When a mux instruction is created from a pair of complementary conditional
transfers, it can be placed at the location of either the earlier or the
later of the transfers. Since it will use the operands of the original
transfers, putting it in the earlier location may hoist a kill of a source
register that was originally further down. Make sure the kill flag is
removed if the register is still used afterwards.
llvm-svn: 305300
This creates a new library called BinaryFormat that has all of
the headers from llvm/Support containing structure and layout
definitions for various types of binary formats like dwarf, coff,
elf, etc as well as the code for identifying a file from its
magic.
Differential Revision: https://reviews.llvm.org/D33843
llvm-svn: 304864
I did this a long time ago with a janky python script, but now
clang-format has built-in support for this. I fed clang-format every
line with a #include and let it re-sort things according to the precise
LLVM rules for include ordering baked into clang-format these days.
I've reverted a number of files where the results of sorting includes
isn't healthy. Either places where we have legacy code relying on
particular include ordering (where possible, I'll fix these separately)
or where we have particular formatting around #include lines that
I didn't want to disturb in this patch.
This patch is *entirely* mechanical. If you get merge conflicts or
anything, just ignore the changes in this patch and run clang-format
over your #include lines in the files.
Sorry for any noise here, but it is important to keep these things
stable. I was seeing an increasing number of patches with irrelevant
re-ordering of #include lines because clang-format was used. This patch
at least isolates that churn, makes it easy to skip when resolving
conflicts, and gets us to a clean baseline (again).
llvm-svn: 304787
The initial assumption was that the simplification would converge to a
fixed point relatvely quickly. Turns out that there are legitimate situa-
tions where the complexity of the code causes it to take a large number
of iterations.
Two main changes:
- Instead of aborting upon hitting the limit, simply return nullptr.
- Reduce the limit to 10,000 from 100,000.
llvm-svn: 304441
TargetPassConfig is not useful for targets that do not use the CodeGen
library, so we may just as well store a pointer to an
LLVMTargetMachine instead of just to a TargetMachine.
While at it, also change the constructor to take a reference instead of a
pointer as the TM must not be nullptr.
llvm-svn: 304247
For multiplications of 64-bit values (giving 64-bit result), detect
cases where the arguments are sign-extended 32-bit values, on a per-
operand basis. This will allow few patterns to match a wider variety
of combinations in which extensions can occur.
llvm-svn: 304223
Summary:
Currently FPOWI defaults to Legal and LegalizeDAG.cpp turns Legal into Expand for this opcode because Legal is a "lie".
This patch changes the default for this opcode to Expand and removes the hack from LegalizeDAG.cpp. It also removes all the code in the targets that set this opcode to Expand themselves since they can just rely on the default.
Reviewers: spatel, RKSimon, efriedma
Reviewed By: RKSimon
Subscribers: jfb, dschuff, sbc100, jgravelle-google, nemanjai, javed.absar, andrew.w.kaylor, llvm-commits
Differential Revision: https://reviews.llvm.org/D33530
llvm-svn: 304215
Summary:
Implements PR889
Removing the virtual table pointer from Value saves 1% of RSS when doing
LTO of llc on Linux. The impact on time was positive, but too noisy to
conclusively say that performance improved. Here is a link to the
spreadsheet with the original data:
https://docs.google.com/spreadsheets/d/1F4FHir0qYnV0MEp2sYYp_BuvnJgWlWPhWOwZ6LbW7W4/edit?usp=sharing
This change makes it invalid to directly delete a Value, User, or
Instruction pointer. Instead, such code can be rewritten to a null check
and a call Value::deleteValue(). Value objects tend to have their
lifetimes managed through iplist, so for the most part, this isn't a big
deal. However, there are some places where LLVM deletes values, and
those places had to be migrated to deleteValue. I have also created
llvm::unique_value, which has a custom deleter, so it can be used in
place of std::unique_ptr<Value>.
I had to add the "DerivedUser" Deleter escape hatch for MemorySSA, which
derives from User outside of lib/IR. Code in IR cannot include MemorySSA
headers or call the MemoryAccess object destructors without introducing
a circular dependency, so we need some level of indirection.
Unfortunately, no class derived from User may have any virtual methods,
because adding a virtual method would break User::getHungOffOperands(),
which assumes that it can find the use list immediately prior to the
User object. I've added a static_assert to the appropriate OperandTraits
templates to help people avoid this trap.
Reviewers: chandlerc, mehdi_amini, pete, dberlin, george.burgess.iv
Reviewed By: chandlerc
Subscribers: krytarowski, eraman, george.burgess.iv, mzolotukhin, Prazek, nlewycky, hans, inglorion, pcc, tejohnson, dberlin, llvm-commits
Differential Revision: https://reviews.llvm.org/D31261
llvm-svn: 303362
This provides a new way to access the TargetMachine through
TargetPassConfig, as a dependency.
The patterns replaced here are:
* Passes handling a null TargetMachine call
`getAnalysisIfAvailable<TargetPassConfig>`.
* Passes not handling a null TargetMachine
`addRequired<TargetPassConfig>` and call
`getAnalysis<TargetPassConfig>`.
* MachineFunctionPasses now use MF.getTarget().
* Remove all the TargetMachine constructors.
* Remove INITIALIZE_TM_PASS.
This fixes a crash when running `llc -start-before prologepilog`.
PEI needs StackProtector, which gets constructed without a TargetMachine
by the pass manager. The StackProtector pass doesn't handle the case
where there is no TargetMachine, so it segfaults.
Related to PR30324.
Differential Revision: https://reviews.llvm.org/D33222
llvm-svn: 303360
This patch adds min/max population count, leading/trailing zero/one bit counting methods.
The min methods return answers based on bits that are known without considering unknown bits. The max methods give answers taking into account the largest count that unknown bits could give.
Differential Revision: https://reviews.llvm.org/D32931
llvm-svn: 302925
Using arguments with attribute inalloca creates problems for verification
of machine representation. This attribute instructs the backend that the
argument is prepared in stack prior to CALLSEQ_START..CALLSEQ_END
sequence (see http://llvm.org/docs/InAlloca.htm for details). Frame size
stored in CALLSEQ_START in this case does not count the size of this
argument. However CALLSEQ_END still keeps total frame size, as caller can
be responsible for cleanup of entire frame. So CALLSEQ_START and
CALLSEQ_END keep different frame size and the difference is treated by
MachineVerifier as stack error. Currently there is no way to distinguish
this case from actual errors.
This patch adds additional argument to CALLSEQ_START and its
target-specific counterparts to keep size of stack that is set up prior to
the call frame sequence. This argument allows MachineVerifier to calculate
actual frame size associated with frame setup instruction and correctly
process the case of inalloca arguments.
The changes made by the patch are:
- Frame setup instructions get the second mandatory argument. It
affects all targets that use frame pseudo instructions and touched many
files although the changes are uniform.
- Access to frame properties are implemented using special instructions
rather than calls getOperand(N).getImm(). For X86 and ARM such
replacement was made previously.
- Changes that reflect appearance of additional argument of frame setup
instruction. These involve proper instruction initialization and
methods that access instruction arguments.
- MachineVerifier retrieves frame size using method, which reports sum of
frame parts initialized inside frame instruction pair and outside it.
The patch implements approach proposed by Quentin Colombet in
https://bugs.llvm.org/show_bug.cgi?id=27481#c1.
It fixes 9 tests failed with machine verifier enabled and listed
in PR27481.
Differential Revision: https://reviews.llvm.org/D32394
llvm-svn: 302527
Allocframe and the following stores on the stack have a latency of 2 cycles
when not in the same packet. This happens because R29 is needed early by the
store instruction. Since one of such stores can be packetized along with
allocframe and use old value of R29, we can assign it 0 cycle latency
while leaving latency of other stores to the default value of 2 cycles.
Patch by Jyotsna Verma.
llvm-svn: 302034
The packetizer needs to convert .cur instruction to its regular form if
the use is not in the same packet as the .cur. The code in the packetizer
handles one type of .cur, which is the vector load case. This patch
updates the packetizer so that it can undo all the .cur instructions.
In the test case, the .cur is the 128B version, but there are also the
post-increment versions.
Patch by Brendon Cahoon.
llvm-svn: 302032
The compiler was generating code that ends up ignoring a multiple
latency dependence between two instructions by scheduling the
intructions in back-to-back packets.
The packetizer needs to end a packet if the latency of the current
current insruction and the source in the previous packet is
greater than 1 cycle. This case occurs when there is still room in
the current packet, but scheduling the instruction causes a stall.
Instead, the packetizer should start a new packet. Also, if the
current packet already contains a stall, then it is okay to add
another instruction to the packet that also causes a stall. This
occurs when there are no instructions that can be scheduled in
between the producer and consumer instructions.
This patch changes the latency for loads to 2 cycles from 3 cycles.
This change refects that a load only needs to be separated by
one extra packet to eliminate the stall.
Patch by Ikhlas Ajbar.
llvm-svn: 301954
This eliminates many extra 'Idx' induction variables in loops over
arguments in CodeGen/ and Target/. It also reduces the number of places
where we assume that ReturnIndex is 0 and that we should add one to
argument numbers to get the corresponding attribute list index.
NFC
llvm-svn: 301666
This patch introduces a new KnownBits struct that wraps the two APInt used by computeKnownBits. This allows us to treat them as more of a unit.
Initially I've just altered the signatures of computeKnownBits and InstCombine's simplifyDemandedBits to pass a KnownBits reference instead of two separate APInt references. I'll do similar to the SelectionDAG version of computeKnownBits/simplifyDemandedBits as a separate patch.
I've added a constructor that allows initializing both APInts to the same bit width with a starting value of 0. This reduces the repeated pattern of initializing both APInts. Once place default constructed the APInts so I added a default constructor for those cases.
Going forward I would like to add more methods that will work on the pairs. For example trunc, zext, and sext occur on both APInts together in several places. We should probably add a clear method that can be used to clear both pieces. Maybe a method to check for conflicting information. A method to return (Zero|One) so we don't write it out everywhere. Maybe a method for (Zero|One).isAllOnesValue() to determine if all bits are known. I'm sure there are many other methods we can come up with.
Differential Revision: https://reviews.llvm.org/D32376
llvm-svn: 301432
1. RegisterClass::getSize() is split into two functions:
- TargetRegisterInfo::getRegSizeInBits(const TargetRegisterClass &RC) const;
- TargetRegisterInfo::getSpillSize(const TargetRegisterClass &RC) const;
2. RegisterClass::getAlignment() is replaced by:
- TargetRegisterInfo::getSpillAlignment(const TargetRegisterClass &RC) const;
This will allow making those values depend on subtarget features in the
future.
Differential Revision: https://reviews.llvm.org/D31783
llvm-svn: 301221
This reverts commit r301105, 4, 3 and 1, as a follow up of the previous
revert, which broke even more bots.
For reference:
Revert "[APInt] Use operator<<= where possible. NFC"
Revert "[APInt] Use operator<<= instead of shl where possible. NFC"
Revert "[APInt] Use ashInPlace where possible."
PR32754.
llvm-svn: 301111
getRawData exposes the internal type of the APInt class directly to its users. Ideally we wouldn't expose such an implementation detail.
This patch fixes a few of the easy cases by using truncate, extract, or a rotate.
llvm-svn: 301105
Also, make a few changes to allow using the pass in .mir testcases.
Among other things, change the abbreviation from opt-amode to amode-opt,
because otherwise lit would expand the "opt" part to the full path to
the opt binary.
llvm-svn: 300707
From a user prospective, it forces the use of an annoying nullptr to mark the end of the vararg, and there's not type checking on the arguments.
The variadic template is an obvious solution to both issues.
Differential Revision: https://reviews.llvm.org/D31070
llvm-svn: 299949
Module::getOrInsertFunction is using C-style vararg instead of
variadic templates.
From a user prospective, it forces the use of an annoying nullptr
to mark the end of the vararg, and there's not type checking on the
arguments. The variadic template is an obvious solution to both
issues.
llvm-svn: 299925
A number of backends (AArch64, MIPS, ARM) have been using
MCContext::reportError to report issues such as out-of-range fixup values in
their TgtAsmBackend. This is great, but because MCContext couldn't easily be
threaded through to the adjustFixupValue helper function from its usual
callsite (applyFixup), these backends ended up adding an MCContext* argument
and adding another call to applyFixup to processFixupValue. Adding an
MCContext parameter to applyFixup makes this unnecessary, and even better -
applyFixup can take a reference to MCContext rather than a potentially null
pointer.
Differential Revision: https://reviews.llvm.org/D30264
llvm-svn: 299529
This moves the isMask and isShiftedMask functions to be class methods. They now use the MathExtras.h function for single word size and leading/trailing zeros/ones or countPopulation for the multiword size. The previous implementation made multiple temorary memory allocations to do the bitwise arithmetic operations to match the MathExtras.h implementation.
Differential Revision: https://reviews.llvm.org/D31565
llvm-svn: 299362
- Avoid explosive growth of the simplification queue by not queuing
expressions that are alredy in it.
- Add an iteration counter and abort after a sufficiently large number
of iterations (assuming that it's a symptom of an infinite loop).
llvm-svn: 298655
[Hexagon] Recognize polynomial-modulo loop idiom again
Regain the ability to recognize loops calculating polynomial modulo
operation. This ability has been lost due to some changes in the
preceding optimizations. Add code to preprocess the IR to a form
that the pattern matching code can recognize.
llvm-svn: 298400
Summary:
This class is a list of AttributeSetNodes corresponding the function
prototype of a call or function declaration. This class used to be
called ParamAttrListPtr, then AttrListPtr, then AttributeSet. It is
typically accessed by parameter and return value index, so
"AttributeList" seems like a more intuitive name.
Rename AttributeSetImpl to AttributeListImpl to follow suit.
It's useful to rename this class so that we can rename AttributeSetNode
to AttributeSet later. AttributeSet is the set of attributes that apply
to a single function, argument, or return value.
Reviewers: sanjoy, javed.absar, chandlerc, pete
Reviewed By: pete
Subscribers: pete, jholewinski, arsenm, dschuff, mehdi_amini, jfb, nhaehnle, sbc100, void, llvm-commits
Differential Revision: https://reviews.llvm.org/D31102
llvm-svn: 298393
Regain the ability to recognize loops calculating polynomial modulo
operation. This ability has been lost due to some changes in the
preceding optimizations. Add code to preprocess the IR to a form
that the pattern matching code can recognize.
llvm-svn: 298282
This function will find the closest ref node aliased to Reg that is
in an instruction preceding Inst. This could be used to identify the
hypothetical reaching def of Reg, if Reg was a member of Inst.
llvm-svn: 297524
- Fix the insertion point, which occasionally could have been incorrect.
- Avoid creating multiple bitsplits with the same operands, if an old one
could be reused.
llvm-svn: 297414
When extracting a bitfield from the high register in a register pair,
the final offset should be relative to the high register (for 32-bit
extracts).
llvm-svn: 297288
Merge the tail block into the loop in cases where the main loop body
exits early, subject to profitability constraints. This will coalesce
the loop body into fewer blocks.
For example:
loop: loop:
// loop body // loop body
if (...) jump exit --> // more body
more: if (...) jump exit
// more body jump loop
jump loop
llvm-svn: 297033
The code in updateDeadFlags removed unnecessary <dead> flags, but there
can be cases where such a flag is not set, and yet a register has become
dead. For example, if a mux with identical inputs is replaced with a COPY,
the predicate register may no longer be used after that.
llvm-svn: 297032
On Hexagon, values of type i1 are passed in registers of type i32,
even though i1 is not a legal value for these registers. This is a
special case and needs special handling to maintain consistency of
the lowering information.
This fixes PR32089.
llvm-svn: 296645
In the bit tracker, references to other bit values in which the register
is 0 are prohibited. This means that generating self-referential register
cells like { w:32 [0-15]:s[0-15] [16-31]:s[15] } is impossible. In order
to get a self-referential cell, it had to be stored into a map and then
reloaded from it. To avoid this step, add a function that will set the
register to a given value without going through the map.
llvm-svn: 296025
Defining nodes should not alias with one another, while clobbering
nodes can. When pushing defs on stacks, push clobbers first, link
non-clobbering defs, then push the defs.
The data flow in a statement is now: uses -> clobbers -> defs.
llvm-svn: 295356