Currently the backend special cases x86_intrcc and treats the first
parameter as byval. Make the IR require byval for this parameter to
remove this special case, and avoid the dependence on the pointee
element type.
Fixes bug 46672.
I'm not sure the IR is enforcing all the calling convention
constraints. clang seems to ignore the attribute for empty parameter
lists, but the IR tolerates it.
In addition to making the code a lot easier to grasp by localizing many
helper functions to the only file where they are actually needed, this
will allow creating helper functions that depend on allocator_traits
outside of <memory>.
This is done as part of implementing array support in allocate_shared,
which requires non-trivial array initialization algorithms that would be
better to keep out of <memory> for sanity. It's also a first step towards
splitting up our monolithic headers into finer grained ones, which will
make it easier to reuse functionality across the library. For example,
it's just weird that we had to define `addressof` inside <type_traits>
to avoid circular dependencies -- instead it's better to implement those
in true helper headers.
Differential Revision: https://reviews.llvm.org/D93074
This patch add support of riscv multilibs in the Baremetal toolchain. It is
a bit different to what is done in GNU.cpp as we are not iterating a
GNU sysroot to find the multilibs. This is intended for an llvm only
toolchain. We are not checking for the presence of any runtime bits to
enable a specific multilib.
I have structured the patch so that other targets for which
there is no multilibs support yet in Baremetal.cpp (e.g. arm-none-eabi)
will not be affected. Patch also allows some multilibs reuse.
Long term, I would like to go in the direction of data-driven specification of
multilib directories and flags.
Reviewed By: jroelofs
Differential Revision: https://reviews.llvm.org/D93138
The entry block should always be the first BB in a function.
So we should not rotate a chain contains the entry block.
Differential Revision: https://reviews.llvm.org/D92882
his is a preparation patch for supporting multiple exits in the loop vectorizer, by itself it should be mostly NFC. This patch moves the loop structure checks from LAA to their respective consumers (where duplicates don't already exist). Moving the checks does end up changing some of the optimization warnings and debug output slightly, but nothing that appears to be a regression.
Why do this? Well, after auditing the code, I can't actually find anything in LAA itself which relies on having all instructions within a loop execute an equal number of times. This patch simply makes this explicit so that if one consumer - say LV in the near future (hopefully) - wants to handle a broader class of loops, it can do so.
Differential Revision: https://reviews.llvm.org/D92066
This will allow for caching pattern lists across multiple pass instances, such as when multithreading. This is an extremely important invariant for PDL patterns, which are compiled at runtime when the FrozenRewritePatternList is built.
Differential Revision: https://reviews.llvm.org/D93146
Before this patch, the Restorer depended on copy elision to happen.
Without copy elision, the function ScopedSet calls the move constructor
before its dtor. The dtor will prematurely restore the reference to the
original value.
Instead of relying the compiler to not use the Restorer's copy
constructor, delete its copy and assign operators. Hence, callers cannot
move or copy a Restorer object anymore, and have to explicitly provide
the reset state. ScopedSet avoids calling move/copy operations by
relying on unnamed return value optimization, which is mandatory in
C++17.
Reviewed By: klausler
Differential Revision: https://reviews.llvm.org/D88797
Some operations use integer literals as part of their custom format that don't necessarily map to an internal IntegerAttr. This revision exposes the same `parseInteger` functions as the DialectAsmParser to allow for these operations to parse integer literals without incurring the otherwise unnecessary roundtrip through IntegerAttr.
Differential Revision: https://reviews.llvm.org/D93152
This revision adds a new `printNewline` hook to OpAsmPrinter that allows for printing a newline within the custom format of an operation, that is then indented to the start of the operation. Support for the declarative assembly format is also added, in the form of a `\n` literal.
Differential Revision: https://reviews.llvm.org/D93151
`isCUDADeviceBuiltinSurfaceType()`/`isCUDADeviceBuiltinTextureType()` do not
work on dependent types as they rely on specific type attributes.
Differential Revision: https://reviews.llvm.org/D92893
My bot runs VS 2019, but it could not compile this code.
Message:
[55/2465] Building CXX object lib\Target\Hexagon\CMakeFiles\LLVMHexagonCodeGen.dir\HexagonVectorCombine.cpp.obj
FAILED: lib/Target/Hexagon/CMakeFiles/LLVMHexagonCodeGen.dir/HexagonVectorCombine.cpp.obj
...
C:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Tools\MSVC\14.23.28105\include\map(71): error C2976: 'std::map': too few template arguments
C:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Tools\MSVC\14.23.28105\include\map(71): note: see declaration of 'std::map'
The version in the path, 14.23, corresponds to _MSC_VER 1923, so raise
the version floor to 1924.
I have not tested with versions between 1924 and 1928 (latest), but the
latest works with the variadic version.
This moves the vtype decoding and printing to RISCVBaseInfo. This keeps all of
the decoding code in the same area as the encoding code. This will make it
easier to change the decoding for the 1.0 spec in the future.
We're now sharing the printing with the debug output for operands in the
assembler. This also fixes that debug output to include the tail and mask
agnostic bits. Since the printing code works on the vtype immediate value, we
now encode the immediate during parsing and store just the immediate in the
operand.
We currently do this for SANITIZER_IOS, which includes devices *and* simulators. This change opts out the check for simulators to unify the behavior with macOS, because VM size is really a property of the host OS, and not the simulator.
<rdar://problem/72129387>
Differential Revision: https://reviews.llvm.org/D93140
Fix a bug causing to pick the wrong vector size to broadcast to when the source
vectors have different ranks.
Differential Revision: https://reviews.llvm.org/D93118
- New function SDValue getBackchainAddress() used by
lowerDYNAMIC_STACKALLOC() and lowerSTACKRESTORE() to properly handle the
backchain offset also with packed-stack.
- Make a common function getBackchainOffset() for the computation of the
backchain offset and use in some places (NFC).
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D93171
- Once an instruction is simplified, foldable candidates from it should
be invalidated or skipped as the operand index is no longer valid.
Differential Revision: https://reviews.llvm.org/D93174
When using the FixedLenDecoderEmitter, llvm-tblgen emits tables with (OPC_ExtractField, OPC_ExtractFilterValue) opcode sequences to match the contiguous fixed bits of a given instruction's encoding. This encoding is represented in a 64-bit integer. However, the filter values were represented in a 32-bit integer. As such, instructions with fixed 64-bit encodings resulted in a table with an OPC_ExtractField for all 64 bits, followed by an OPC_ExtractFilterValue containing just the low 32 bits of their encoding, causing the filter never to match.
The exact point at which the slicing occurred was during the map insertion at line 630.
Differential Revision: https://reviews.llvm.org/D92423
Follow the naming set by TI's own GCC-based toolchain.
Also, force the `osabi` field to `ELFOSABI_STANDALONE`, this matches GNU LD's output (the patching is done in `elf32_msp430_post_process_headers`).
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D92931
If a function happens to:
- call setjmp
- do a 16-byte stack allocation
- call a function that sets up a stack frame and longjmp's back
The stack pointer that is restores by setjmp will no longer point to a valid
back chain. According to the ABI, stack accesses in such a function are to be
frame pointer based - so it is an error (quite obviously) to restore the stack
from the back chain.
We already restore the stack from the frame pointer when there are calls to
fast_cc functions. We just need to also do that when there are calls to setjmp.
This patch simply does that.
This was pointed out by the Julia team.
Differential revision: https://reviews.llvm.org/D92906
This commit splits SPIR-V's serialization and deserialization code
into separate libraries. The motiviation being that the serializer
is used more often the deserializer and therefore lumping them
together unnecessarily increases binary size for the most common
case.
This commit also moves these libraries into the Target/ directory
to follow MLIR convention.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D91548
The import of a typedefs with an attribute uses clang::Decl::setAttrs().
But that needs the ASTContext which we can get only from the
TranslationUnitDecl. But we can get the TUDecl only thourgh the
DeclContext, which is not set by the time of the setAttrs call.
Fix: import the attributes only after the DC is surely imported.
Btw, having the attribute import initiated from GetImportedOrCreateDecl was
fundamentally flawed. Now that is implicitly fixed.
Differential Revision: https://reviews.llvm.org/D92962
Even though d38205144f was mostly a correct
fix for the external non-PHI users, it's not a *generally* correct fix,
because the 'placeholder' values in those trivial PHI's we create
shouldn't be *always* 'undef', but the PHI itself for the backedges,
else we end up with wrong value, as the `@pr48450_2` test shows.
But we can't just do that, because we can't check that the PHI
can be it's own incoming value when coming from certain predecessor,
because we don't have a dominator tree.
So until we can address this correctness problem properly,
ensure that we don't perform the transformation
if there are such problematic external uses.
Making dominator tree available there is going to be involved,
since `-simplifycfg` pass currently does not preserve/update domtree...
- std::reference_wrapper
- std::function
- std::mem_fn
While I'm here, remove _VSTD:: qualification from calls to `declval`
because it takes no arguments and thus isn't susceptible to ADL.
Differential Revision: https://reviews.llvm.org/D92884
Everywhere, normalize the whitespace to `::new (EXPR) T`.
Everywhere, normalize the spelling of the cast to `(void*)EXPR`.
Without the cast to `(void*)`, the expression triggers ADL on GCC.
(I think this is a GCC bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98249)
Even if it doesn't trigger ADL, it still seems incorrect to use any argument
that's not exactly `(void*)` because that opens the possibility of overload
resolution picking a user-defined overload of `operator new`, which would be
wrong.
Differential Revision: https://reviews.llvm.org/D93153
clang-scan-deps contains some command line parsing and modifications.
This patch adds support for clang-cl command options.
Differential Revision: https://reviews.llvm.org/D92191
D82227 has added a proper check to limit PHI vectorization to the
maximum vector register size. That unfortunately resulted in at
least a couple of regressions on SystemZ and x86.
This change reverts PHI handling from D82227 and replaces it with
a more general check in SLPVectorizerPass::tryToVectorizeList().
Moved to tryToVectorizeList() it allows to restart vectorization
if initial chunk fails.
However, this function is more general and handles not only PHI
but everything which SLP handles. If vectorization factor would
be limited to maximum vector register size it would limit much
more vectorization than before leading to further regressions.
Therefore a new TTI callback getMaximumVF() is added with the
default 0 to preserve current behavior and limit nothing. Then
targets can decide what is better for them.
The callback gets ElementSize just like a similar getMinimumVF()
function and the main opcode of the chain. The latter is to avoid
regressions at least on the AMDGPU. We can have loads and stores
up to 128 bit wide, and <2 x 16> bit vector math on some
subtargets, where the rest shall not be vectorized. I.e. we need
to differentiate based on the element size and operation itself.
Differential Revision: https://reviews.llvm.org/D92059
We have this subtarget feature so it makes sense to use it here. This is
NFC because it's always defined by default on GFX8+.
Differential Revision: https://reviews.llvm.org/D93202
Add andm, orm, xorm, eqvm, nndm, negm, pcvm, lzvm, and tovm intrinsic
instructions, a few pseudo instructions to expand logical intrinsic
using VM512, a mechnism to expand such pseudo instructions, and
regression tests. Also, assign vector mask types and vector mask
register classes correctly. This is required to use VM512 registers
as function arguments.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D93093
This bug hasn't affected us yet as our usage is too basic, i.e. we don't
rely on the defaults provided by `SetDefaultFortranOpts` just yet. This
will change shortly.
Use SCEV to salvage additional @llvm.dbg.value that have turned into
referencing undef after transformation (and traditional
salvageDebugInfo). Before rewrite (but after introduction of new
induction variables) use SCEV to compute an equivalent set of values for
each @llvm.dbg.value in the loop body (among the loop header PHI-nodes).
After rewrite (and dead PHI elimination) update those @llvm.dbg.value
now referencing undef by picking a remaining value from its equivalence
set. Allow match with offset by inserting compensation code in the
DIExpression.
Fixes : PR38815
Differential Revision: https://reviews.llvm.org/D87494
Otherwise they come out in random (inode?) order.
Also `chmod +x` the generator, and re-run it. Somehow on Marek's
machine it produced \r\n line endings?! Open all files with
`newline='\n'` so that (if the Python3 docs are correct)
that won't happen again.
Differential Revision: https://reviews.llvm.org/D93137