When expanding the LoopStart, we try to remove the iteration count
calculation. However, if part of the calculation was also used to
calculate the number of elements we could end up deleting
instructions that were required to feed DLSTP/WLSTP.
Differential Revision: https://reviews.llvm.org/D73275
Summary:
Since D70431 the describeLoadedValue() hook takes a parameter register,
meaning that it can now be asked to describe any register. This means
that we can drop the difference between explicit and implicit defines
that we previously had in collectCallSiteParameters().
I have not found any case for any upstream targets where a parameter
register is only implicitly defined, and does not overlap with any
explicit defines. I don't know if such a case would even make sense. So
as far as I have tested, this patch should be a non-functional change.
However, this reduces the complexity of the code a bit, and it will
simplify the implementation of an upcoming patch which solves PR44118.
Reviewers: djtodoro, NikolaPrica, aprantl, vsk
Reviewed By: djtodoro, vsk
Subscribers: hiraditya, llvm-commits
Tags: #debug-info, #llvm
Differential Revision: https://reviews.llvm.org/D73167
We have no good test for --needed-libs option.
The one we have as a part of Object/readobj-shared-object.test
is not complete.
In this patch I've did a minor NFC changes to the implementation and
added a test. This allowed to remove this piece from
Object/readobj-shared-object.test
Differential revision: https://reviews.llvm.org/D73174
We want that the *.py names for the tests have unique names but
the current ones are sometimes very simple (e.g., "TestUniquePtr.py")
and could collide with unrelated tests. This just gives all these
tests a "FromStdModule" suffix to make these collisions less likely.
It removes the Object/readobj-absent.test test and creates a one more case in
dyn-symbols.test we have.
Differential revision: https://reviews.llvm.org/D73169
We had no test for --hash-table in tools/llvm-readobj.
The one we had was in test/Object and checked that
it is possible to dump the hash table even when an object
doesn't have a section header table.
In this patch I created a test, moved and merged the existent one.
During moving I converted it to be YAML based to stop using the
precompiled binary.
Differential revision: https://reviews.llvm.org/D73105
Summary: LibcUnitTest is missing a dependency on LLVMSupport. This prevents building with shared libraries.
Reviewers: sivachandra
Subscribers: mgorny, MaskRay, libc-commits
Tags: #libc-project
Differential Revision: https://reviews.llvm.org/D73337
The checker bugprone-infinite-loop does not track changes of
variables in the initialization expression of a variable
declared inside the condition of the while statement. This
leads to false positives, similarly to the one in the bug
report https://bugs.llvm.org/show_bug.cgi?id=44618. This
patch fixes this issue by enabling tracking of the variables
of this expression as well.
Differential Revision: https://reviews.llvm.org/D73270
G_CTPOP is generated from llvm.ctpop.<type> intrinsics, clang generates
these intrinsics from __builtin_popcount and __builtin_popcountll.
Add lower and narrow scalar for G_CTPOP.
Lower G_CTPOP for MIPS32.
Differential Revision: https://reviews.llvm.org/D73216
llvm.cttz.<type> intrinsic has additional i1 argument is_zero_undef,
it tells whether zero as the first argument produces a defined result.
G_CTTZ is generated from llvm.cttz.<type> (<type> <src>, i1 false)
intrinsics, clang generates these intrinsics from __builtin_ctz and
__builtin_ctzll.
G_CTTZ_ZERO_UNDEF comes from llvm.cttz.<type> (<type> <src>, i1 true).
Clang generates such intrinsics as parts of expansion of builtin_ffs
and builtin_ffsll. It is also traditionally part of and many
algorithms that are now predicated on avoiding zero-value inputs.
Add narrow scalar (algorithm uses G_CTTZ_ZERO_UNDEF) for G_CTTZ.
Lower G_CTTZ and G_CTTZ_ZERO_UNDEF for MIPS32.
Differential Revision: https://reviews.llvm.org/D73215
llvm.ctlz.<type> intrinsic has additional i1 argument is_zero_undef,
it tells whether zero as the first argument produces a defined result.
MIPS clz instruction returns 32 for zero input.
G_CTLZ is generated from llvm.ctlz.<type> (<type> <src>, i1 false)
intrinsics, clang generates these intrinsics from __builtin_clz and
__builtin_clzll.
G_CTLZ_ZERO_UNDEF can also be generated from llvm.ctlz with true as
second argument. It is also traditionally part of and many algorithms
that are now predicated on avoiding zero-value inputs.
Add narrow scalar for G_CTLZ (algorithm uses G_CTLZ_ZERO_UNDEF).
Lower G_CTLZ_ZERO_UNDEF and select G_CTLZ for MIPS32.
Differential Revision: https://reviews.llvm.org/D73214
When targeting mingw, current CMake (3.16) fails to get the right
flags for assembly source files for windows gnu/clang targets
(see https://gitlab.kitware.com/cmake/cmake/merge_requests/4287
for a fix), causing builds to fail due to `-fPIC` being unsupported
in clang for mingw targets
In the meantime, restore the behaviour from before c48974ffd7
selectively on mingw targets, treating the assembly files as C.
Differential Revision: https://reviews.llvm.org/D73436
and macro FUNCTION likewise. NFCI.
Some functions like fmuladd don't really have a node, we should divide
the declaration form those have node to avoid introducing fake nodes.
Differential Revision: https://reviews.llvm.org/D72871
After this change, we need to explicitly list the languages the
project uses, otherwise the assembly source files won't get built
at all.
Previously (before that commit), the assembly source files were
simply treated as C.
The toplevel llvm CMakeLists.txt adds these three languages, so
when building libunwind integrated as part of that, it works fine.
Based on exhaustive llvm-exegesis measurements.
There may still be some imperfections for LEA16r/LEA32r.
Much like was observed in D68646, i'm also measuring some outliers
with some specific registers.
The code for parsing of type-constraints in compound-requirements was not adapted for the new TryAnnotateTypeConstraint which
caused compound-requirements with scope specifiers to ignore them.
Also add regression tests for scope specifiers in type-constraints in more contexts.
Summary:
This allows ASTContext to store only one parent map, rather than storing
an entire parent map for each traversal mode used.
This is therefore a partial revert of commit 0a717d5b (Make it possible
control matcher traversal kind with ASTContext, 2019-12-06).
Reviewers: aaron.ballman, rsmith, rnk
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73388
Summary: masked_load and masked_store instructions require the alignment to be specified and a power of two. It seems to me that this requirement applies to masked_gather and masked_scatter as well.
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73179
This commit changes the logic of `getBuiltinVariableValue` to get
or create the builtin variable in the nearest symbol table. This
will allow us to use this function in other partial conversion
cases where we haven't created the spv.module yet.
Differential Revision: https://reviews.llvm.org/D73416
This commit exposes the func op conversion pattern via a new
`populateBuiltinFuncToSPIRVPatterns` function from the standard
to SPIR-V conversion passs. This is structurally better given
that func op belongs to the builtin dialect. More importantly,
this makes the pattern reusable to other dialect to SPIR-V
dialect conversion as other dialect can well adopt builtin
func op instead of having its own. Besides, it's very common
to use func ops as test wrappers in lit tests, so test passes
will need to handle func ops too.
Differential Revision: https://reviews.llvm.org/D73421
Thus far certain SPIR-V ops have been required to be in spv.module.
While this provides strong verification to catch unexpected errors,
it's quite rigid and makes progressive lowering difficult. Sometimes
we would like to partially lower ops from other dialects, which may
involve creating ops like global variables that should be placed in
other module-like ops. So this commit relaxes the requirement of
such SPIR-V ops' scope to module-like ops. Similarly for function-
like ops.
Differential Revision: https://reviews.llvm.org/D73415
Pull out combineTargetShuffle code added in rG3fd5d1c6e7db into a helper function and extend it to handle shufps(shufps(load(),x),y) and shufps(y,shufps(load(),x)) cases as well.
This change added two new attributes, rounding mode and exception
behavior to the structure FPOptions. These attributes allow more
flexible treatment of specific floating point environment than it is
provided by #pragma STDC FENV_ACCESS.
Differential Revision: https://reviews.llvm.org/D65994
* Generalize the code added in D70637 and D70937. We should eventually remove the EM_MIPS special case.
* Handle R_PPC_LOCAL24PC the same way as R_PPC_REL24.
Reviewed By: Bdragon28
Differential Revision: https://reviews.llvm.org/D73424
Every case in the switch had a string version of themselves. Two
of them had a typo that used : instead of ::
By using a macro we can automate the string creation and avoid
the possibility of typos like this.
This is similar to what is done on the AMDGPU target.
-fno-pie produces a pair of non-GOT-non-PLT relocations R_PPC_ADDR16_{HA,LO} (R_ABS) referencing external
functions.
```
lis 3, func@ha
la 3, func@l(3)
```
In a -no-pie/-pie link, if func is not defined in the executable, a canonical PLT entry (st_value>0, st_shndx=0) will be needed.
References to func in shared objects will be resolved to this address.
-fno-pie -pie should fail with "can't create dynamic relocation ... against ...", so we just need to think about -no-pie.
On x86, the PLT entry passes the JMP_SLOT offset to the rtld PLT resolver.
On x86-64: the PLT entry passes the JUMP_SLOT index to the rtld PLT resolver.
On ARM/AArch64: the PLT entry passes &.got.plt[n]. The PLT header passes &.got.plt[fixed-index]. The rtld PLT resolver can compute the JUMP_SLOT index from the two addresses.
For these targets, the canonical PLT entry can just reuse the regular PLT entry (in PltSection).
On PPC32: PltSection (.glink) consists of `b PLTresolve` instructions and `PLTresolve`. The rtld PLT resolver depends on r11 having been set up to the .plt (GotPltSection) entry.
On PPC64 ELFv2: PltSection (.glink) consists of `__glink_PLTresolve` and `bl __glink_PLTresolve`. The rtld PLT resolver depends on r12 having been set up to the .plt (GotPltSection) entry.
We cannot reuse a `b PLTresolve`/`bl __glink_PLTresolve` in PltSection as a canonical PLT entry. PPC64 ELFv2 avoids the problem by using TOC for any external reference, even in non-pic code, so the canonical PLT entry scenario should not happen in the first place.
For PPC32, we have to create a PLT call stub as the canonical PLT entry. The code sequence sets up r11.
Reviewed By: Bdragon28
Differential Revision: https://reviews.llvm.org/D73399
We would previously try to evaluate atomic constraints of non-template functions as-is,
and since they are now unevaluated at first, this would cause incorrect evaluation (bugs #44657, #44656).
Substitute into atomic constraints of non-template functions as we would atomic constraints
of template functions, in order to rebuild the expressions in a constant-evaluated context.
Symbol information can be used to improve out-of-range/misalignment diagnostics.
It also helps R_ARM_CALL/R_ARM_THM_CALL which has different behaviors with different symbol types.
There are many (67) relocateOne() call sites used in thunks, {Arm,AArch64}errata, PLT, etc.
Rename them to `relocateNoSym()` to be clearer that there is no symbol information.
Reviewed By: grimar, peter.smith
Differential Revision: https://reviews.llvm.org/D73254