As suggested by rnk at D67643#1673043, instead of reading files
multiple times until an appropriate encoding is found, read them once
as binary, and then try to decode what was read.
For Python >= 3.5, don't fail when attempting to decode the
`diff_bytes` output in order to print it.
Avoid failures for Python 2.7 used on some Windows bots by
transforming diff output with `lit.util.to_string` before writing it
to stdout.
Finally, add some tests for encoding handling.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D68664
llvm-svn: 375018
Do not generate non-existing sdwa instructions. It reduces the
number of generated instructions by 185.
Differential Revision: https://reviews.llvm.org/D69010
llvm-svn: 375016
This helps with testing and debugging for paths that are assumed
absolute.
It also uses a FileError to provide the file path it's trying to open.
llvm-svn: 375008
Currently, BPF backend is doing truncation elimination. If one truncation
is performed on a value defined by narrow loads, then it could be redundant
given BPF loads zero extend the destination register implicitly.
When the definition of the truncated value is a merging value (PHI node)
that could come from different code paths, then checks need to be done on
all possible code paths.
Above described optimization was introduced as r306685, however it doesn't
work when there is back-edge, for example when loop is used inside BPF
code.
For example for the following code, a zero-extended value should be stored
into b[i], but the "and reg, 0xffff" is wrongly eliminated which then
generates corrupted data.
void cal1(unsigned short *a, unsigned long *b, unsigned int k)
{
unsigned short e;
e = *a;
for (unsigned int i = 0; i < k; i++) {
b[i] = e;
e = ~e;
}
}
The reason is r306685 was trying to do the PHI node checks inside isel
DAG2DAG phase, and the checks are done on MachineInstr. This is actually
wrong, because MachineInstr is being built during isel phase and the
associated information is not completed yet. A quick search shows none
target other than BPF is access MachineInstr info during isel phase.
For an PHI node, when you reached it during isel phase, it may have all
predecessors linked, but not successors. It seems successors are linked to
PHI node only when doing SelectionDAGISel::FinishBasicBlock and this
happens later than PreprocessISelDAG hook.
Previously, BPF program doesn't allow loop, there is probably the reason
why this bug was not exposed.
This patch therefore fixes the bug by the following approach:
- The existing truncation elimination code and the associated
"load_to_vreg_" records are removed.
- Instead, implement truncation elimination using MachineSSA pass, this
is where all information are built, and keep the pass together with other
similar peephole optimizations inside BPFMIPeephole.cpp. Redundant move
elimination logic is updated accordingly.
- Unit testcase included + no compilation errors for kernel BPF selftest.
Patch Review
===
Patch was sent to and reviewed by BPF community at:
https://lore.kernel.org/bpf
Reported-by: David Beckett <david.beckett@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
llvm-svn: 375007
Summary:
This patch implements the `TargetInstrInfo::verifyInstruction` hook for RISC-V. Currently the hook verifies the machine instruction's immediate operands, to check if the immediates are within the expected bounds. Without the hook invalid immediates are not detected except when doing assembly parsing, so they are silently emitted (including being truncated when emitting object code).
The bounds information is specified in tablegen by using the `OperandType` definition, which sets the `MCOperandInfo`'s `OperandType` field. Several RISC-V-specific immediate operand types were created, which extend the `MCInstrDesc`'s `OperandType` `enum`.
To have the hook called with `llc` pass it the `-verify-machineinstrs` option. For Clang add the cmake build config `-DLLVM_ENABLE_EXPENSIVE_CHECKS=True`, or temporarily patch `TargetPassConfig::addVerifyPass`.
Review concerns:
- The patch adds immediate operand type checks that cover at least the base ISA. There are several other operand types for the C extension and one type for the F/D extensions that were left out of this initial patch because they introduced further design concerns that I felt were best evaluated separately.
- Invalid register classes (e.g. passing a GPR register where a GPRC is expected) are already caught, so were not included.
- This design makes the more abstract `MachineInstr` verification depend on MC layer definitions, which arguably is not the cleanest design, but is in line with how things are done in other parts of the target and LLVM in general.
- There is some duplication of logic already present in the `MCOperandPredicate`s. Since the `MachineInstr` and `MCInstr` notions of immediates are fundamentally different, this is currently necessary.
Reviewers: asb, lenary
Reviewed By: lenary
Subscribers: hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67397
llvm-svn: 375006
Summary:
Even though writelane doesn't have the same constraints as other valu
instructions it still can't violate the >1 SGPR operand constraint
Due to later register propagation (e.g. fixing up vgpr operands via
readfirstlane) changing writelane to only have a single SGPR is tricky.
This implementation puts a new check after SIFixSGPRCopies that prevents
multiple SGPRs being used in any writelane instructions.
The algorithm used is to check for trivial copy prop of suitable constants into
one of the SGPR operands and perform that if possible. If this isn't possible
put an explicit copy of Src1 SGPR into M0 and use that instead (this is
allowable for writelane as the constraint is for SGPR read-port and not
constant-bus access).
Reviewers: rampitec, tpr, arsenm, nhaehnle
Reviewed By: rampitec, arsenm, nhaehnle
Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, mgorny, yaxunl, tpr, t-tye, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D51932
Change-Id: Ic7553fa57440f208d4dbc4794fc24345d7e0e9ea
llvm-svn: 375004
When on windows gnu-ar treats member names as case insensitive. This
commit implements the same behaviour.
Differential Revision: https://reviews.llvm.org/D68033
llvm-svn: 375002
Summary:
This is something of a workaround to avoid a crash later on in type
legalizer (WidenVectorResult()).
Also added some f16 tests, including a non-working v3f16 case with
a FIXME.
Reviewers: arsenm, tpr, nhaehnle
Reviewed By: arsenm
Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68865
llvm-svn: 374993
Summary:
Currently Thumb2InstrInfo.cpp uses a register class which is
auto-generated by tablegen. Such approach is fragile because
auto-generated classes might change when other register classes are
added. For example, before https://reviews.llvm.org/D62667
we were using GPRPair_with_gsub_1_in_rGPRRegClass, but had to
change it to GPRPair_with_gsub_1_in_GPRwithAPSRnospRegClass
because the former class stopped being generated (this did not change
the functionality though).
This patch adds a register class consisting of even-odd GPR register
pairs from (R0, R1) to (R10, R11), which excludes (R12, SP) and uses
it in Thumb2InstrInfo.cpp instead of
GPRPair_with_gsub_1_in_GPRwithAPSRnospRegClass.
Reviewers: ostannard, simon_tatham, dmgreen, efriedma
Reviewed By: simon_tatham
Subscribers: kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69026
llvm-svn: 374990
Summary:
Extend the SI Load/Store optimizer to merge MIMG load instructions. Handle
different flavours of image_load and image_sample instructions.
When the instructions of the same subclass differ only in dmask, merge
them and update dmask accordingly.
Reviewers: nhaehnle
Reviewed By: nhaehnle
Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64911
llvm-svn: 374984
Instead of inserting everything after the 'root' of the reduction,
insert all instructions as close to their operands as possible. This
can help reduce register pressure.
Differential Revision: https://reviews.llvm.org/D67392
llvm-svn: 374981
This adds the initial plumbing to support optimisation remarks in
the IR hardware-loop pass.
I have left a todo in a comment where we can improve the reporting,
and will iterate on that now that we have this initial support in.
Differential Revision: https://reviews.llvm.org/D68579
llvm-svn: 374980
In LiveDebugVariables.cpp:
Prior to this patch, UserValues were grouped into linked list chains. Each
chain was the union of two sets: { A: Matching Source variable } or
{ B: Matching virtual register }. A ptr to the heads (or 'leaders')
of each of these chains were kept in a map with the { Source variable } used
as the key (set A predicate) and another with { Virtual register } as key
(set B predicate).
There was a search through the chains in the function getUserValue looking for
UserValues with matching { Source variable, Complex expression, Inlined-at
location }. Essentially searching for a subset of A through two interleaved
linked lists of set A and B. Importantly, by design, the subset will only
contain one or zero elements here. That is to say a UserValue can be uniquely
identified by the tuple { Source variable, Complex expression, Inlined-at
location } if it exists.
This patch removes the linked list and instead uses a DenseMap to map
the tuple { Source variable, Complex expression, Inlined-at location }
to UserValue ptrs so that the getUserValue search predicate is this map key.
The virtual register map now maps a vreg to a SmallVector<UserVal *> so that
set B is still available for quick searches.
Reviewers: aprantl, probinson, vsk, dblaikie
Reviewed By: aprantl
Subscribers: russell.gallop, gbedwell, bjope, hiraditya, llvm-commits
Tags: #debug-info, #llvm
Differential Revision: https://reviews.llvm.org/D68816
llvm-svn: 374979
Similar to r374970, but I don't have a test for this.
PromoteTargetBoolean is intended to be use for legalizing an
operand that needs to be promoted. It picks its type based on
the return from getSetccResultType and is intended to be used
when we have freedom to pick the new type. But the return type
we need for WidenVecOp_SETCC is completely determined by the
type of the input node.
llvm-svn: 374972
PromoteTargetBoolean calls getSetccResultType to get the return
type. But we were passing it the setcc result type rather than the
setcc input type. This causes an issue on X86 with avx512vl where
the setcc result type for vXf16 vectors is vXi16 while the
result type for vXi16 vectors is vXi1.
There's really no guarantee that getSetccResultType is the type
we need here. So now we just grab the extend type from
getExtendForContent and extend to the original result VT of the
node we're splitting.
llvm-svn: 374970
Since r374600 clang emits base address selection entries. Currently
dsymutil does not support these entries and incorrectly interprets them
as location list entries.
This patch adds support for base address selection entries in dsymutil
and makes sure they are relocated correctly.
Thanks to Dave for coming up with the test case!
Differential revision: https://reviews.llvm.org/D69005
llvm-svn: 374957
Before this patch, changing the working directory of the RedirectingFS
would just forward to its external file system. This prevented us from
having a working directory that only existed in the VFS mapping.
This patch adds support for a virtual working directory in the
RedirectingFileSystem. It now keeps track of its own WD in addition to
updating the WD of the external file system. This ensures that we can
still fall through for relative paths.
This change was originally motivated by the reproducer infrastructure in
LLDB where we want to deal transparently with relative paths.
Differential revision: https://reviews.llvm.org/D65677
llvm-svn: 374955
RTDyldObjectLinkingLayer allowed clients to register a NotifyEmitted function to
reclaim ownership of object buffers once they had been linked. This patch adds
similar functionality to ObjectLinkingLayer: Clients can now optionally call the
ObjectLinkingLayer::setReturnObjectBuffer method to register a function that
will be called when discarding object buffers. If set, this function will be
called to return ownership of the object regardless of whether the link
succeeded or failed.
Use cases for this function include debug dumping (it provides a way to dump
all objects linked into JIT'd code) and object re-use (e.g. storing an
object in a cache).
llvm-svn: 374951
InProcessMemoryManager used to make separate memory allocation calls for each
permission level (RW, RX, RO), which could lead to target-out-of-range errors
if data and code were placed too far apart (this was the source of failures in
the JITLink/AArch64 testcase when it was first landed).
This patch updates InProcessMemoryManager to allocate a single slab which is
subdivided between text and data. This should guarantee that accesses remain
in-range provided that individual object files do not exceed 1Mb in size.
This patch also re-enables the JITLink/AArch64 testcase.
llvm-svn: 374948
This essentially reverts a commit [1] that removed the adaptor for
Python unittests. The code has been slightly refactored to make it more
additive: all code is contained in LitTestCase.py.
Usage sites will require a small adaption:
```
[old]
import lit.discovery
...
test_suite = lit.discovery.load_test_suite(...)
[new]
import lit.LitTestCase
...
test_suite = lit.LitTestCase.load_test_suite(...)
```
This was put back on request by Daniel Dunbar, since I wrongly assumed
that the functionality is unused. At least llbuild still uses this [2].
[1] 70ca752ccf
[2] https://github.com/apple/swift-llbuild/blob/master/utils/Xcode/LitXCTestAdaptor/LitTests.py#L16
Reviewed By: ddunbar
Differential Revision: https://reviews.llvm.org/D69002
llvm-svn: 374947
Summary:
Two conditions could lead to infinite loops when processing PHI nodes in
SIFixSGPRCopies.
The first condition involves a REG_SEQUENCE that uses registers defined by both
a PHI and a COPY.
The second condition arises when a physical register is copied to a virtual
register which is then used in a PHI node. If the same virtual register is
copied to the same physical register, the result is an endless loop.
%0:sgpr_64 = COPY $sgpr0_sgpr1
%2 = PHI %0, %bb.0, %1, %bb.1
$sgpr0_sgpr1 = COPY %0
Reviewers: alex-t, rampitec, arsenm
Reviewed By: rampitec
Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68970
llvm-svn: 374944
SUMMARY:
in the xcoff, if the number of relocation entries or line number entries is
overflow(large than or equal 65535) , there will be overflow section for it.
The interpret of overflow section is different with generic section header,
the patch implement parsing the overflow section.
Reviewers: hubert.reinterpretcast,sfertile,jasonliu
Subscribers: rupprecht, seiya
Differential Revision: https://reviews.llvm.org/D68575
llvm-svn: 374941
This reverts the original commit and the follow up:
Revert "[VirtualFileSystem] Support virtual working directory in the RedirectingFS"
Revert "[test] Update YAML mapping in VirtualFileSystemTest"
llvm-svn: 374935
Summary:
Renames `ExprType` to the more apt `BlockType` and adds a variant for
multivalue blocks. Currently non-void blocks are only generated at the
end of functions where the block return type needs to agree with the
function return type, and that remains true for multivalue
blocks. That invariant means that the actual signature does not need
to be stored in the block signature `MachineOperand` because it can be
inferred by `WebAssemblyMCInstLower` from the return type of the
parent function. `WebAssemblyMCInstLower` continues to lower block
signature operands to immediates when possible but lowers multivalue
signatures to function type symbols. The AsmParser and Disassembler
are updated to handle multivalue block types as well.
Reviewers: aheejin, dschuff, aardappel
Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68889
llvm-svn: 374933
Summary:
When listing the index in `llvm-objdump -h`, use a zero-based counter instead of the actual section index (e.g. shdr->sh_index for ELF).
While this is effectively a noop for now (except one unit test for XCOFF), the index values will change in a future patch that filters certain sections out (e.g. symbol tables). See D68669 for more context. Note: the test case in `test/tools/llvm-objdump/X86/section-index.s` already covers the case of incrementing the section index counter when sections are skipped.
Reviewers: grimar, jhenderson, espindola
Reviewed By: grimar
Subscribers: emaste, sbc100, arichardson, aheejin, arphaman, seiya, llvm-commits, MaskRay
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68848
llvm-svn: 374931
Exposes an issue in getFauxShuffleMask where the OR(SHUFFLE,SHUFFLE) decode should always resolve zero/undef elements.
Part of the fix for PR43024 where ideally we shouldn't call resolveTargetShuffleAndZeroables for Depth == 0
llvm-svn: 374928
I removed this test to unblock the ARM bots while looking into failures
(r374915), and am reinstating it now with a fix.
I believe the problem was that counter ptr address I used,
'\0\0\6\0\1\0\0\1', set the high bits of the pointer, not the low bits
like I wanted. On x86_64 this superficially looks like it tests r370826,
but it doesn't, as it would have been caught before r370826. However, on
ARM (or, 32-bit hosts more generally), I suspect the high bits were
cleared, and you get a 'valid' profile.
I verified that setting the *low* bits of the pointer does trigger the
new condition:
-// Note: The CounterPtr here is off-by-one. This should trigger a malformed profile error.
-RUN: printf '\0\0\6\0\1\0\0\1' >> %t.profraw
+// Note: The CounterPtr here is off-by-one.
+//
+// Octal '\11' is 9 in decimal: this should push CounterOffset to 1. As there are two counters,
+// the profile reader should error out.
+RUN: printf '\11\0\6\0\1\0\0\0' >> %t.profraw
This reverts commit c7cf5b3e4b918c9769fd760f28485b8d943ed968.
llvm-svn: 374927
This is remaining part of rG41ca91f2995b: [AIX][XCOFF] Output XCOFF
object text section header and symbol entry for rogram code.
SUMMARY:
Original form of this patch is provided by Stefan Pintillie.
1. The patch try to output program code section header , symbol entry for
program code (PR) and Instruction into the raw text section.
2. The patch include how to alignment and layout the CSection in the text
section.
3. The patch also reorganize the code , put some codes into a function.
(XCOFFObjectWriter::writeSymbolTableEntryForControlSection)
Additional: We can not add raw data of text section test in the patch, If want
to output raw text section data,it need a function description patch first.
Reviewers: hubert.reinterpretcast, sfertile, jasonliu, xingxue.
Subscribers: wuzish, nemanjai, hiraditya, MaskRay, jsjji.
Differential Revision: https://reviews.llvm.org/D66969
llvm-svn: 374923
Check that a call has an attached MemoryAccess before calling
getClobbering on the instruction.
If no access is attached, the instruction does not access memory.
Resolves PR43441.
llvm-svn: 374920