Include support for NetBSD core dumps from evbarm/aarch64 systems,
and matching test cases for them.
Based on earlier work by Kamil Rytarowski.
Differential Revision: https://reviews.llvm.org/D60034
llvm-svn: 357399
This instruction writes a block of allocation tags
and stores zero to the associated data locations.
It differs from STGM by 1 bit and has the same
arguments.
The specification can be found here:
https://developer.arm.com/docs/ddi0596/c
Differential Revision: https://reviews.llvm.org/D60065
llvm-svn: 357397
This patch replaces the addition of VK_RISCV_CALL in RISCVMCCodeEmitter with
creating the RISCVMCExpr when tail/call instructions are parsed, or, in the
codegen case, when the callee symbols are created.
This required adding a new CallSymbol operand type so that VK_RISCV_CALL is
only added to tail/call instructions.
This patch will allow further expansion of parsing and codegen to easily
include PLT symbols which must generate the R_RISCV_CALL_PLT relocation.
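A rough sketch of the shape of this change (illustrative only; the helper name
and surrounding boilerplate are assumptions, while MCSymbolRefExpr::create,
RISCVMCExpr::create and VK_RISCV_CALL are the pieces named above):

  #include "MCTargetDesc/RISCVMCExpr.h" // in-tree relative include
  #include "llvm/MC/MCContext.h"
  #include "llvm/MC/MCExpr.h"
  #include "llvm/MC/MCInst.h"
  #include "llvm/MC/MCSymbol.h"

  using namespace llvm;

  // Hypothetical helper: wrap the callee in a VK_RISCV_CALL expression when
  // the call/tail operand is created, instead of adding the modifier later in
  // RISCVMCCodeEmitter.
  static void addCallTargetOperand(MCInst &Inst, const MCSymbol *Callee,
                                   MCContext &Ctx) {
    const MCExpr *Ref = MCSymbolRefExpr::create(Callee, Ctx);
    const MCExpr *CallExpr =
        RISCVMCExpr::create(Ref, RISCVMCExpr::VK_RISCV_CALL, Ctx);
    Inst.addOperand(MCOperand::createExpr(CallExpr));
  }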
Differential Revision: https://reviews.llvm.org/D55560
Patch by Lewis Revill.
llvm-svn: 357396
The STGV/LDGV instructions were replaced with
STGM/LDGM. The encodings remain the same, but there is no longer writeback,
so there are no unpredictable encodings to check for.
The specification can be found here:
https://developer.arm.com/docs/ddi0596/c
Differential Revision: https://reviews.llvm.org/D60064
llvm-svn: 357395
Summary:
ODR errors are not necessarily true errors during the import of ASTs.
ASTMerge and CrossTU should use the warning equivalent of every CTU error,
while Sema should emit errors as before.
Reviewers: martong, a_sidorin, shafik, a.sidorin
Reviewed By: a_sidorin
Subscribers: rnkovacs, dkrupp, Szelethus, jdoerfert, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D58897
Patch by Endre Fulop!
llvm-svn: 357394
This patch adds an implementation of a PC-relative addressing sequence to be
used when -mcmodel=medium is specified. With absolute addressing, a 'medium'
code model may cause addresses to be out of range. This is because while
'medium' implies a 2 GiB addressing range, this 2 GiB can be at any offset, as
opposed to 'small', which implies the first 2 GiB only.
Note that LLVM/Clang currently specifies code models differently from GCC:
LLVM's small and medium imply the same functionality as GCC's medlow and
medany respectively.
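For illustration (the assembly in the comment is a hand-written approximation,
not generated output), a simple global access is expected to lower to a
PC-relative auipc/addi pair rather than an absolute lui/addi pair:

  // Compile with e.g. clang --target=riscv64 -mcmodel=medium -S example.c
  extern int counter;                 // hypothetical global

  int *addr_of_counter(void) {
    // Expected lowering (approximate):
    //   .Lpcrel_hi0:
    //     auipc a0, %pcrel_hi(counter)
    //     addi  a0, a0, %pcrel_lo(.Lpcrel_hi0)
    return &counter;
  }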
Differential Revision: https://reviews.llvm.org/D54143
Patch by Lewis Revill.
llvm-svn: 357393
The latest version of the MTE spec added a system
register 'GMID_EL1'. It contains the block size used
by the LDGM and STGM instructions and is read-only.
The specification can be found here:
https://developer.arm.com/docs/ddi0596/c
llvm-svn: 357392
According to the OpenMP 5.0 standard, 2.11.4 "allocate Clause", Restrictions:
for any list item that is specified in the allocate clause on a directive, a
data-sharing attribute clause that may create a private copy of that list item
must be specified on the same directive. This patch adds checks for this
restriction.
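For illustration, a minimal sketch (not one of the added test cases; compile
with -fopenmp):

  int main(void) {
    int x = 0;
    // OK: private(x) on the same directive creates the private copy that
    // allocate(x) applies to.
    #pragma omp parallel private(x) allocate(x)
    { x = 1; }

    // Diagnosed by the new check: 'x' is in allocate() but no data-sharing
    // clause on this directive creates a private copy of it.
    // #pragma omp parallel allocate(x)
    // { x = 1; }
    return x;
  }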
llvm-svn: 357390
Summary:
This fixes PR41270.
The recursive function evaluateInDifferentElementOrder expects to be called
on a vector Value, so when we call it on a vector GEP's arguments, we must
first check that the argument is indeed a vector.
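A minimal sketch of the kind of guard being described (illustrative only, not
the actual diff; the helper name is made up):

  #include "llvm/IR/Type.h"
  #include "llvm/IR/Value.h"

  // Only vector-typed GEP operands may be re-ordered; scalar operands are the
  // case that previously slipped through to evaluateInDifferentElementOrder.
  static bool canReorderGEPOperand(const llvm::Value *Op) {
    return Op->getType()->isVectorTy();
  }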
Reviewers: reames, spatel
Reviewed By: spatel
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60058
llvm-svn: 357389
This reverts commit 75216a6dbcfe5fb55039ef06a07e419fa875f4a5.
I'll recommit with a better commit message that references the
Phabricator review.
llvm-svn: 357387
This fixes PR41270.
The recursive function evaluateInDifferentElementOrder expects to be called
on a vector Value, so when we call it on a vector GEP's arguments, we must
first check that the argument is indeed a vector.
llvm-svn: 357385
We introduce a new class hierarchy for debug types merging (in DebugTypes.h). The end-goal is to parallelize the type merging - please see the plan in D59226.
Previously, dependency discovery was done on the fly, much later, during the type merging loop. Unfortunately, parallelizing the type merging requires the dependencies to be merged in first, before any dependent ObjFile, thus this early discovery.
The overall intention for this patch is to discover debug information dependencies at a much earlier stage, when processing input files. Currently, two types of dependency are supported: PDB type servers (when compiling with MSVC /Zi) and precompiled headers OBJs (when compiling with MSVC /Yc and /Yu). Once discovered, an explicit link is added into the dependent ObjFile, through the new debug types class hierarchy introduced in DebugTypes.h.
Differential Revision: https://reviews.llvm.org/D59053
llvm-svn: 357383
If we have a commutable vector binop with inverted select-shuffles,
we don't care about the order of the operands in each vector lane:
LHS = shuffle V1, V2, <0, 5, 6, 3>
RHS = shuffle V2, V1, <0, 5, 6, 3>
LHS + RHS --> <V1[0]+V2[0], V2[1]+V1[1], V2[2]+V1[2], V1[3]+V2[3]> --> V1 + V2
PR41304:
https://bugs.llvm.org/show_bug.cgi?id=41304
...is currently titled as an SLP enhancement, but at least for the
given example, we can reduce that in instcombine because we are just
eliminating shuffles.
As noted in the TODO, this could be generalized, but I haven't thought
through those patterns completely, so this is limited to what appears
to be always safe.
Differential Revision: https://reviews.llvm.org/D60048
llvm-svn: 357382
Summary:
The missing `<` causes the lld command to overwrite the test file, which fails
in environments that mark the test files as read-only.
Reviewers: bkramer
Reviewed By: bkramer
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60060
llvm-svn: 357380
Adds a `seto` pattern expansion. Without it the lowerings of `fcmp one` and
`fcmp ord` would be inefficient due to an unoptimized double negation.
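For reference, the comparisons involved have these semantics (scalar C++
equivalents shown for illustration; not the backend patterns themselves):

  #include <cmath>

  // fcmp ord x, y: true iff neither operand is NaN ("set if ordered").
  bool fcmp_ord(double x, double y) {
    return !std::isnan(x) && !std::isnan(y);
  }

  // fcmp one x, y: ordered and not equal.
  bool fcmp_one(double x, double y) {
    return !std::isnan(x) && !std::isnan(y) && x != y;
  }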
Differential Revision: https://reviews.llvm.org/D59699
llvm-svn: 357378
Summary:
Some synthetic sections can be empty while still being needed, thus they
can't be removed by removeUnusedSyntheticSections(). Rename this member
function to the more appropriate isNeeded(), which has the opposite meaning.
No functional change intended.
Reviewers: ruiu, espindola
Reviewed By: ruiu
Subscribers: jhenderson, grimar, emaste, arichardson, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59982
llvm-svn: 357377
Summary:
We're using ptrace(PTRACE_SETREGSET, NT_X86_XSTATE) to write all non-gpr
registers on x86 Linux. Unfortunately, this method has a quirk, where
the kernel rejects all attempts to write to this area if one supplies a
buffer which is smaller than the area size (even though the kernel will
happily accept partial reads from it).
This means that if the CPU supports some new registers/extensions that
we don't know about (in my case it was the PKRU extension), we will fail
to write *any* non-gpr registers, even those that we know about.
Since this is a situation that's likely to appear again and again, I add
code to NativeRegisterContextLinux_x86_64 to detect the runtime size of
the area, and allocate an appropriate buffer. This does not mean that we
will start automatically supporting all new extensions, but it does mean
that the new extensions will not prevent the old ones from working.
This fixes tests attempting to write to non-gpr registers on new Intel
processors (circa Kaby Lake Refresh).
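A standalone sketch of the size-detection idea (this is not the lldb code;
the buffer size and error handling are simplified, and per ptrace(2) the kernel
shrinks iov_len to the number of bytes it actually returned):

  #include <cstdint>
  #include <elf.h>       // NT_X86_XSTATE
  #include <sys/ptrace.h>
  #include <sys/types.h>
  #include <sys/uio.h>
  #include <vector>

  // 'pid' must be a traced, stopped inferior.
  long GetXStateAreaSize(pid_t pid) {
    std::vector<uint8_t> buf(64 * 1024); // comfortably larger than any XSAVE area
    struct iovec iov = {buf.data(), buf.size()};
    if (ptrace(PTRACE_GETREGSET, pid, (void *)(uintptr_t)NT_X86_XSTATE, &iov) < 0)
      return -1;
    return static_cast<long>(iov.iov_len); // the size a SETREGSET buffer must match
  }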
Reviewers: jankratochvil, davezarzycki
Subscribers: lldb-commits
Differential Revision: https://reviews.llvm.org/D59991
llvm-svn: 357376
A pcrel_lo will point to the associated pcrel_hi fixup which in turn points to
the real target. RISCVMCExpr::evaluatePCRelLo will work around this
indirection in order to allow the fixup to be evaluated properly. However, if
relocations are forced (e.g. because linker relaxation is enabled) then this
evaluation is undesired and will result in a relocation with the wrong target.
This patch modifies evaluatePCRelLo so it will not try to evaluate if the
fixup will be forced as a relocation. A new helper method is added to
RISCVAsmBackend to query this.
Differential Revision: https://reviews.llvm.org/D59686
llvm-svn: 357374
Summary: There's a typo in the docs, as mentioned in the title. Please see the diff.
Reviewers: JonasToth
Subscribers: sylvestre.ledru, nemanjai, kbarton, cfe-commits
Tags: #clang-tools-extra, #clang
Differential Revision: https://reviews.llvm.org/D60050
llvm-svn: 357371
Summary:
Currently the C++03 implementation of common_type has very different behavior from the C++11 one. This causes bugs, including inside `<chrono>`.
This patch unifies the two implementations as best it can. The more code they share, the less their behavior can diverge.
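A small illustration of why <chrono> cares (the duration assertion reflects
typical implementations such as libc++, where seconds and milliseconds share
the same representation type):

  #include <chrono>
  #include <type_traits>

  // The basic trait must agree between the C++03 emulation and C++11.
  static_assert(std::is_same<std::common_type<int, long>::type, long>::value,
                "common_type<int, long> should be long");

  // <chrono> specifies mixed-duration arithmetic via common_type, so a
  // divergent C++03 common_type breaks results like this one.
  using sum_t = std::common_type<std::chrono::seconds,
                                 std::chrono::milliseconds>::type;
  static_assert(std::is_same<sum_t, std::chrono::milliseconds>::value,
                "seconds + milliseconds is computed in milliseconds");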
Reviewers: mclow.lists, ldionne, sbenza
Reviewed By: mclow.lists, ldionne
Subscribers: libcxx-commits
Tags: #libc
Differential Revision: https://reviews.llvm.org/D59678
llvm-svn: 357370
One motivation for making this change is that not using movmsk is likely a
major source of the perf difference between clang and gcc on the C-Ray
benchmark, as shown here:
https://www.phoronix.com/scan.php?page=article&item=gcc-clang-2019&num=5
...but this change alone isn't enough to solve that problem.
The 'all-of' examples show what is likely the worst case trade-off: we end up with
an extra instruction (or 2 if we count the 'xor' register clearing). The 'any-of'
examples look clearly better using movmsk because we've traded 2 vector instructions
for 2 scalar instructions, and movmsk may have better timing than the generic 'movq'.
If we examine the llvm-mca output for these cases, it appears that even though the
'all-of' movmsk variant looks worse on paper, it would perform better on both
Haswell and Jaguar.
$ llvm-mca -mcpu=haswell no_movmsk.s -timeline
Iterations: 100
Instructions: 400
Total Cycles: 504
Total uOps: 400
Dispatch Width: 4
uOps Per Cycle: 0.79
IPC: 0.79
Block RThroughput: 1.0
$ llvm-mca -mcpu=haswell movmsk.s -timeline
Iterations: 100
Instructions: 600
Total Cycles: 358
Total uOps: 600
Dispatch Width: 4
uOps Per Cycle: 1.68
IPC: 1.68
Block RThroughput: 1.5
$ llvm-mca -mcpu=btver2 no_movmsk.s -timeline
Iterations: 100
Instructions: 400
Total Cycles: 407
Total uOps: 400
Dispatch Width: 2
uOps Per Cycle: 0.98
IPC: 0.98
Block RThroughput: 2.0
$ llvm-mca -mcpu=btver2 movmsk.s -timeline
Iterations: 100
Instructions: 600
Total Cycles: 311
Total uOps: 600
Dispatch Width: 2
uOps Per Cycle: 1.93
IPC: 1.93
Block RThroughput: 3.0
Finally, there may be CPUs where movmsk is horribly slow (old AMD small cores?), but if
that's true, then we're also almost certainly making the wrong transform already for
reductions with >2 elements, so that should be fixed independently.
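For reference, an intrinsics-level sketch of the reductions being discussed
(illustrative only, not the committed tests): movmskps collapses the four sign
bits of a compare result into a GPR, so 'any-of'/'all-of' become a movmsk plus
a scalar test.

  #include <xmmintrin.h>

  bool any_lt(__m128 a, __m128 b) {
    return _mm_movemask_ps(_mm_cmplt_ps(a, b)) != 0;   // some lane matched
  }

  bool all_lt(__m128 a, __m128 b) {
    return _mm_movemask_ps(_mm_cmplt_ps(a, b)) == 0xF; // all four lanes matched
  }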
Differential Revision: https://reviews.llvm.org/D59997
llvm-svn: 357367
In PR41304:
https://bugs.llvm.org/show_bug.cgi?id=41304
...we have a case where we want to fold a binop of select-shuffle (blended) values.
Rather than try to match commuted variants of the pattern, we can canonicalize the
shuffles and check for mask equality with commuted operands.
We don't produce arbitrary shuffle masks in instcombine, but select-shuffles are a
special case that the backend is required to handle because we already canonicalize
vector select to this shuffle form.
So there should be no codegen difference from this change. It's possible that this
improves CSE in IR though.
Differential Revision: https://reviews.llvm.org/D60016
llvm-svn: 357366
Summary:
Based on a patch by Dustin Howett, modified to not change the ABI for
ELF platforms.
Use more Windows-like section names.
This also makes things more readable by PE/COFF debug tools that assume
sections fit in the first header.
With these changes in, it is now possible to build a working WinObjC
with clang and the WinObjC version of GNUstep libobjc (upstream GNUstep
libobjc + a workaround for incremental linking, which can be removed
once LINK.EXE gains a feature to opt sections out of receiving extra
padding during an incremental link).
Patch by Dustin Howett!
Reviewers: DHowett-MSFT
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D58724
llvm-svn: 357364
Without this change, linking multiple objects containing block
descriptors together on Windows will generate duplicate symbol errors.
Patch by Dustin Howett!
Differential Revision: https://reviews.llvm.org/D58807
llvm-svn: 357363
Straightforward port of the SafepointIRVerifier pass to the new Pass Manager framework.
Fix By: skatkov
Reviewed By: fedor.sergeev
Differential Revision: https://reviews.llvm.org/D59825
This is a re-land of r357147/r357148 with LLVM_ENABLE_MODULES build fixed.
Adding IR/SafepointIRVerifier.h into its own module.
llvm-svn: 357361
Negate updates flags like a subtract. We should be able to use the flags from the RMW form of negate when we have (store (X86ISD::SUB 0, load A), A).
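An illustrative source pattern (not the committed test) that produces this DAG:
an in-memory negate whose result is then tested, so the EFLAGS set by the
memory-form neg can be reused rather than recomputed.

  bool negate_and_test(int *p) {
    int v = -*p;
    *p = v;          // RMW negate: (store (X86ISD::SUB 0, load A), A)
    return v == 0;   // can reuse the flags from 'neg' instead of a 'test'
  }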
Differential Revision: https://reviews.llvm.org/D60007
llvm-svn: 357353
This patch adds support for the RISC-V hard float ABIs, building on top of
rL355771, which added basic target-abi parsing and MC layer support. It also
builds on some re-organisation and expansion of the ABI and calling
convention tests which were recently committed directly upstream.
A number of aspects of the RISC-V hard float ABIs require frontend
support (e.g. flattening of structs and passing int+fp and fp+fp structs in a
pair of registers), and will be addressed in a Clang patch.
As can be seen from the tests, it would be worthwhile extending
RISCVMergeBaseOffsets to handle constant pool as well as global accesses.
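For illustration (not one of the committed tests), the kind of case the hard
float ABIs change: under ilp32d/lp64d a small fp+fp aggregate is flattened and
passed in a pair of FPRs (fa0/fa1), and that flattening is the part that needs
the follow-up Clang patch.

  struct Point { double x, y; };

  double sum(struct Point p) {
    // Expected with the D ABIs: p.x in fa0, p.y in fa1, result in fa0.
    return p.x + p.y;
  }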
Differential Revision: https://reviews.llvm.org/D59357
llvm-svn: 357352
Fixes PR41316 where the expanded PAVG intrinsic had one of its ADDs turned into an OR due to its operands having no conflicting bits.
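For reference, what PAVGB computes per element (a scalar sketch of the
semantics, not the fix itself); the bug was the '+ 1' in the expanded form
being rewritten into an OR:

  #include <cstdint>

  uint8_t pavgb(uint8_t a, uint8_t b) {
    // 9-bit intermediate, rounds up.
    return static_cast<uint8_t>((unsigned(a) + unsigned(b) + 1) >> 1);
  }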
llvm-svn: 357351
vararg.ll previously lacked RV64 tests. This patch also prepares for using
vararg.ll to test handling of varargs for the ilp32f/ilp32d/lp64f/lp64d hard
float ABIs. In these ABIs, varargs are passed as in either the ilp32 or lp64
ABI. Due to some slight codegen differences, different check lines are needed
when RV32D is enabled.
llvm-svn: 357350