As suggested by rnk at D67643#1673043, instead of reading files
multiple times until an appropriate encoding is found, read them once
as binary, and then try to decode what was read.
For Python >= 3.5, don't fail when attempting to decode the
`diff_bytes` output in order to print it.
Avoid failures for Python 2.7 used on some Windows bots by
transforming diff output with `lit.util.to_string` before writing it
to stdout.
Finally, add some tests for encoding handling.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D68664
llvm-svn: 375018
When the parallel region is called directly in the sequential region,
the zeroed tid/bound id are used. But they must point to the different
memory locations as the parameters are marked as noalias.
llvm-svn: 375017
Do not generate non-existing sdwa instructions. It reduces the
number of generated instructions by 185.
Differential Revision: https://reviews.llvm.org/D69010
llvm-svn: 375016
This helps with testing and debugging for paths that are assumed
absolute.
It also uses a FileError to provide the file path it's trying to open.
llvm-svn: 375008
Currently, BPF backend is doing truncation elimination. If one truncation
is performed on a value defined by narrow loads, then it could be redundant
given BPF loads zero extend the destination register implicitly.
When the definition of the truncated value is a merging value (PHI node)
that could come from different code paths, then checks need to be done on
all possible code paths.
Above described optimization was introduced as r306685, however it doesn't
work when there is back-edge, for example when loop is used inside BPF
code.
For example for the following code, a zero-extended value should be stored
into b[i], but the "and reg, 0xffff" is wrongly eliminated which then
generates corrupted data.
void cal1(unsigned short *a, unsigned long *b, unsigned int k)
{
unsigned short e;
e = *a;
for (unsigned int i = 0; i < k; i++) {
b[i] = e;
e = ~e;
}
}
The reason is r306685 was trying to do the PHI node checks inside isel
DAG2DAG phase, and the checks are done on MachineInstr. This is actually
wrong, because MachineInstr is being built during isel phase and the
associated information is not completed yet. A quick search shows none
target other than BPF is access MachineInstr info during isel phase.
For an PHI node, when you reached it during isel phase, it may have all
predecessors linked, but not successors. It seems successors are linked to
PHI node only when doing SelectionDAGISel::FinishBasicBlock and this
happens later than PreprocessISelDAG hook.
Previously, BPF program doesn't allow loop, there is probably the reason
why this bug was not exposed.
This patch therefore fixes the bug by the following approach:
- The existing truncation elimination code and the associated
"load_to_vreg_" records are removed.
- Instead, implement truncation elimination using MachineSSA pass, this
is where all information are built, and keep the pass together with other
similar peephole optimizations inside BPFMIPeephole.cpp. Redundant move
elimination logic is updated accordingly.
- Unit testcase included + no compilation errors for kernel BPF selftest.
Patch Review
===
Patch was sent to and reviewed by BPF community at:
https://lore.kernel.org/bpf
Reported-by: David Beckett <david.beckett@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
llvm-svn: 375007
Summary:
This patch implements the `TargetInstrInfo::verifyInstruction` hook for RISC-V. Currently the hook verifies the machine instruction's immediate operands, to check if the immediates are within the expected bounds. Without the hook invalid immediates are not detected except when doing assembly parsing, so they are silently emitted (including being truncated when emitting object code).
The bounds information is specified in tablegen by using the `OperandType` definition, which sets the `MCOperandInfo`'s `OperandType` field. Several RISC-V-specific immediate operand types were created, which extend the `MCInstrDesc`'s `OperandType` `enum`.
To have the hook called with `llc` pass it the `-verify-machineinstrs` option. For Clang add the cmake build config `-DLLVM_ENABLE_EXPENSIVE_CHECKS=True`, or temporarily patch `TargetPassConfig::addVerifyPass`.
Review concerns:
- The patch adds immediate operand type checks that cover at least the base ISA. There are several other operand types for the C extension and one type for the F/D extensions that were left out of this initial patch because they introduced further design concerns that I felt were best evaluated separately.
- Invalid register classes (e.g. passing a GPR register where a GPRC is expected) are already caught, so were not included.
- This design makes the more abstract `MachineInstr` verification depend on MC layer definitions, which arguably is not the cleanest design, but is in line with how things are done in other parts of the target and LLVM in general.
- There is some duplication of logic already present in the `MCOperandPredicate`s. Since the `MachineInstr` and `MCInstr` notions of immediates are fundamentally different, this is currently necessary.
Reviewers: asb, lenary
Reviewed By: lenary
Subscribers: hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67397
llvm-svn: 375006
Summary:
Even though writelane doesn't have the same constraints as other valu
instructions it still can't violate the >1 SGPR operand constraint
Due to later register propagation (e.g. fixing up vgpr operands via
readfirstlane) changing writelane to only have a single SGPR is tricky.
This implementation puts a new check after SIFixSGPRCopies that prevents
multiple SGPRs being used in any writelane instructions.
The algorithm used is to check for trivial copy prop of suitable constants into
one of the SGPR operands and perform that if possible. If this isn't possible
put an explicit copy of Src1 SGPR into M0 and use that instead (this is
allowable for writelane as the constraint is for SGPR read-port and not
constant-bus access).
Reviewers: rampitec, tpr, arsenm, nhaehnle
Reviewed By: rampitec, arsenm, nhaehnle
Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, mgorny, yaxunl, tpr, t-tye, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D51932
Change-Id: Ic7553fa57440f208d4dbc4794fc24345d7e0e9ea
llvm-svn: 375004
Summary:
The move to a new, single namespace in r374962 left out some type definitions
from the old namespace and resulted in one naming conflict (`text`). This
revision adds aliases for those definitions and removes one of the `text`
functions from the new namespace.
Reviewers: alexfh
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D69036
llvm-svn: 375003
When on windows gnu-ar treats member names as case insensitive. This
commit implements the same behaviour.
Differential Revision: https://reviews.llvm.org/D68033
llvm-svn: 375002
Since `-mfloat-abi=soft` is taken to mean turning off all uses of the
FP registers, it should turn off the MVE vector instructions as well
as NEON and scalar FP. But it wasn't doing so.
So the options `-march=armv8.1-m.main+mve.fp+fp.dp -mfloat-abi=soft`
would cause the underlying LLVM to //not// support MVE (because it
knows the real target feature relationships and turned off MVE when
the `fpregs` feature was removed), but the clang layer still thought
it //was// supported, and would misleadingly define the feature macro
`__ARM_FEATURE_MVE`.
The ARM driver code already has a long list of feature names to turn
off when `-mfloat-abi=soft` is selected. The fix is to add the missing
entries `mve` and `mve.fp` to that list.
Reviewers: dmgreen
Subscribers: kristof.beyls, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D69025
llvm-svn: 375001
Summary:
This is something of a workaround to avoid a crash later on in type
legalizer (WidenVectorResult()).
Also added some f16 tests, including a non-working v3f16 case with
a FIXME.
Reviewers: arsenm, tpr, nhaehnle
Reviewed By: arsenm
Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68865
llvm-svn: 374993
The static analyzer is warning about potential null dereferences, but in these cases we should be able to use castAs<> directly and if not assert will fire for us.
llvm-svn: 374991
Summary:
Currently Thumb2InstrInfo.cpp uses a register class which is
auto-generated by tablegen. Such approach is fragile because
auto-generated classes might change when other register classes are
added. For example, before https://reviews.llvm.org/D62667
we were using GPRPair_with_gsub_1_in_rGPRRegClass, but had to
change it to GPRPair_with_gsub_1_in_GPRwithAPSRnospRegClass
because the former class stopped being generated (this did not change
the functionality though).
This patch adds a register class consisting of even-odd GPR register
pairs from (R0, R1) to (R10, R11), which excludes (R12, SP) and uses
it in Thumb2InstrInfo.cpp instead of
GPRPair_with_gsub_1_in_GPRwithAPSRnospRegClass.
Reviewers: ostannard, simon_tatham, dmgreen, efriedma
Reviewed By: simon_tatham
Subscribers: kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69026
llvm-svn: 374990
The static analyzer is warning about potential null dereferences, but in these cases we should be able to use cast<> directly and if not assert will fire for us.
llvm-svn: 374989
The static analyzer is warning about a potential null dereference, but in these cases we should be able to use castAs<> directly and if not assert will fire for us.
llvm-svn: 374988
The static analyzer is warning about potential null dereferences, but in these cases we should be able to use castAs<> directly and if not assert will fire for us.
llvm-svn: 374987
Summary:
Extend the SI Load/Store optimizer to merge MIMG load instructions. Handle
different flavours of image_load and image_sample instructions.
When the instructions of the same subclass differ only in dmask, merge
them and update dmask accordingly.
Reviewers: nhaehnle
Reviewed By: nhaehnle
Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64911
llvm-svn: 374984
Summary:
Removes the 'using namespace' under the cursor and qualifies all accesses in the current file.
E.g.:
using namespace std;
vector<int> foo(std::map<int, int>);
Would become:
std::vector<int> foo(std::map<int, int>);
Subscribers: mgorny, ilya-biryukov, MaskRay, jkorous, mgrang, arphaman, kadircet, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D68562
llvm-svn: 374982
Instead of inserting everything after the 'root' of the reduction,
insert all instructions as close to their operands as possible. This
can help reduce register pressure.
Differential Revision: https://reviews.llvm.org/D67392
llvm-svn: 374981
This adds the initial plumbing to support optimisation remarks in
the IR hardware-loop pass.
I have left a todo in a comment where we can improve the reporting,
and will iterate on that now that we have this initial support in.
Differential Revision: https://reviews.llvm.org/D68579
llvm-svn: 374980
In LiveDebugVariables.cpp:
Prior to this patch, UserValues were grouped into linked list chains. Each
chain was the union of two sets: { A: Matching Source variable } or
{ B: Matching virtual register }. A ptr to the heads (or 'leaders')
of each of these chains were kept in a map with the { Source variable } used
as the key (set A predicate) and another with { Virtual register } as key
(set B predicate).
There was a search through the chains in the function getUserValue looking for
UserValues with matching { Source variable, Complex expression, Inlined-at
location }. Essentially searching for a subset of A through two interleaved
linked lists of set A and B. Importantly, by design, the subset will only
contain one or zero elements here. That is to say a UserValue can be uniquely
identified by the tuple { Source variable, Complex expression, Inlined-at
location } if it exists.
This patch removes the linked list and instead uses a DenseMap to map
the tuple { Source variable, Complex expression, Inlined-at location }
to UserValue ptrs so that the getUserValue search predicate is this map key.
The virtual register map now maps a vreg to a SmallVector<UserVal *> so that
set B is still available for quick searches.
Reviewers: aprantl, probinson, vsk, dblaikie
Reviewed By: aprantl
Subscribers: russell.gallop, gbedwell, bjope, hiraditya, llvm-commits
Tags: #debug-info, #llvm
Differential Revision: https://reviews.llvm.org/D68816
llvm-svn: 374979
Summary:
In the macOS 10.15 SDK the ability to link i386 binaries was removed and
in the corresponding OS it is not possible to run macOS i386 binaries.
The consequence of these changes meant that targets like `check-asan`
would fail because:
* Unit tests could not be linked for i386
* Lit tests for i386 would fail due to not being able to execute
compiled binaries.
The simplest fix to this is to simply disable building for i386 for
macOS when using the 10.15 SDK (or newer). This disables building the
i386 slice for most compiler-rt libraries and consequently disables the
unit and lit tests for macOS i386.
Note that because the `DARWIN_osx_ARCHS` CMake variable is a cache
variable this patch will have no affect on existing builds unless
the existing cache variable is deleted. The simplest way to deal with
this is delete existing builds and just do a fresh configure.
Note this should not affect the builtins which are managed with
the `DARWIN_osx_BUILTIN_ARCHS` CMake cache variable.
For those who wish to force using a particular set of architectures when
using newer SDKs passing `-DDARWIN_osx_ARCHS=i386;x86_64;x86_64h` to
CMake should provide a usable (but completely unsupported) workaround.
rdar://problem/55668535
rdar://problem/47939978
Reviewers: kubamracek, yln, azhar, kcc, dvyukov, vitalybuka, cryptoad, eugenis, thakis, phosek
Subscribers: mgorny, #sanitizers, llvm-commits
Tags: #llvm, #sanitizers
Differential Revision: https://reviews.llvm.org/D68292
llvm-svn: 374977
Similar to r374970, but I don't have a test for this.
PromoteTargetBoolean is intended to be use for legalizing an
operand that needs to be promoted. It picks its type based on
the return from getSetccResultType and is intended to be used
when we have freedom to pick the new type. But the return type
we need for WidenVecOp_SETCC is completely determined by the
type of the input node.
llvm-svn: 374972
PromoteTargetBoolean calls getSetccResultType to get the return
type. But we were passing it the setcc result type rather than the
setcc input type. This causes an issue on X86 with avx512vl where
the setcc result type for vXf16 vectors is vXi16 while the
result type for vXi16 vectors is vXi1.
There's really no guarantee that getSetccResultType is the type
we need here. So now we just grab the extend type from
getExtendForContent and extend to the original result VT of the
node we're splitting.
llvm-svn: 374970
Summary:
The workaround added in https://reviews.llvm.org/rL299575 appears to be
working around a bug in Android JB 4.1.x and 4.2.x (API 16 and 17).
Starting in API 16, Android added support for PIE binaries, but the
dynamic linker failed to initialize dlpi_addr to the address that the
executable was loaded at. The bug was fixed in Android JB 4.3.x (API 18).
Improve the true load bias calculation:
* The code was assuming that the first segment would be the PT_PHDR
segment. I think it's better to be explicit and search for PT_PHDR. (It
will be almost as fast in practice.)
* It's more correct to use p_vaddr rather than p_offset. If a PIE
executable is linked with a non-zero image base (e.g. lld's
-Wl,--image-base=xxxx), then we must use p_vaddr here.
The "phdr->p_vaddr < image_base" condition seems unnecessary and maybe
slightly wrong. If the OS were to load a binary at an address smaller than
a vaddr in the binary, we would still want to do this workaround.
The workaround is safe when the linker bug isn't present, because it
should calculate an image_base equal to dlpi_addr. Note that with API 21
and up, this workaround should never activate for dynamically-linked
objects, because non-PIE executables aren't allowed.
Consolidate the fix into a single block of code that calculates the true
image base, and make it clear that the fix no longer applies after API 18.
See https://github.com/android/ndk/issues/505 for details.
Reviewers: mclow.lists, srhines, danalbert, compnerd
Reviewed By: compnerd
Subscribers: srhines, krytarowski, christof, libcxx-commits
Tags: #libc
Differential Revision: https://reviews.llvm.org/D68971
llvm-svn: 374969
Summary:
ScriptInterpreterPython needs to save and restore sys.stdout and
friends when LLDB runs a python script.
It currently does this using FILE*, which is not optimal. If
whatever was in sys.stdout can not be represented as a FILE*, then
it will not be restored correctly when the script is finished.
It also means that if the debugger's own output stream is not
representable as a file, ScriptInterpreterPython will not be able
to redirect python's output correctly.
This patch updates ScriptInterpreterPython to represent files with
lldb_private::File, and to represent whatever the user had in
sys.stdout as simply a PythonObject.
This will make lldb interoperate better with other scripts or programs
that need to manipulate sys.stdout.
Reviewers: JDevlieghere, jasonmolenda, labath
Reviewed By: labath
Subscribers: lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D68962
llvm-svn: 374964
Summary:
This revision introduces a new namespace, `clang::transformer`, to hold
the declarations for the Transformer library.
Reviewers: gribozavr
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D68876
llvm-svn: 374962
Since r374600 clang emits base address selection entries. Currently
dsymutil does not support these entries and incorrectly interprets them
as location list entries.
This patch adds support for base address selection entries in dsymutil
and makes sure they are relocated correctly.
Thanks to Dave for coming up with the test case!
Differential revision: https://reviews.llvm.org/D69005
llvm-svn: 374957
Before this patch, changing the working directory of the RedirectingFS
would just forward to its external file system. This prevented us from
having a working directory that only existed in the VFS mapping.
This patch adds support for a virtual working directory in the
RedirectingFileSystem. It now keeps track of its own WD in addition to
updating the WD of the external file system. This ensures that we can
still fall through for relative paths.
This change was originally motivated by the reproducer infrastructure in
LLDB where we want to deal transparently with relative paths.
Differential revision: https://reviews.llvm.org/D65677
llvm-svn: 374955