Summary:
- If the coerced type is still a pointer, it should be set with proper
parameter attributes, such as `noalias`, `nonnull`, and etc. Hoist
that (pointer) parameter attribute setting so that the coerced pointer
parameter could be marked properly.
Depends on D79394
Reviewers: rjmccall, kerbowa, yaxunl
Subscribers: jvesely, nhaehnle, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79395
Summary:
- Skip copying function arguments and unnecessary casting by using them
directly.
Reviewers: rjmccall, kerbowa, yaxunl
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79394
This addresses a compilation failure on GCC 5:
error: #error This file requires compiler and library support for the
ISO C++ 2011 standard. This support must be enabled with the -std=c++11
or -std=gnu++11 compiler options.
#error This file requires compiler and library support
Differential Revision: https://reviews.llvm.org/D79439
DMA operation classes in the Standard dialect (`DmaStartOp` and `DmaWaitOp`)
provide helper functions that make numerous assumptions about the number and
order of operands, and about their types. However, these assumptions were not
checked in the verifier, leading to assertion failures or crashes when helper
functions were used on ill-formed ops. Some of the assuptions were checked in
the custom parser (and thus could not check assumption violations in ops
constructed programmatically, e.g., during rewrites) and others were not
checked at all. Introduce the verifiers for all these assumptions and drop
unnecessary checks in the parser that are now covered by the verifier.
Addresses PR45560.
Differential Revision: https://reviews.llvm.org/D79408
When doing a standalone build of flang against an LLVM that contains a
built flang, the tests were run on the flang from LLVM rather than on
the one that was just built.
The problem was in the lit configuration for finding %flang etc.
Fix it to look only in the directory where it was built.
Differential Revision: https://reviews.llvm.org/D79327
Summary:
This patch teaches shouldBeDeferred to take into account the total
cost of inlining.
Suppose we have a call hierarchy {A1,A2,A3,...}->B->C. (Each of A1,
A2, A3, ... calls B, which in turn calls C.)
Without this patch, shouldBeDeferred essentially returns true if
TotalSecondaryCost < IC.getCost()
where TotalSecondaryCost is the total cost of inlining B into As.
This means that if B is a small wraper function, for example, it would
get inlined into all of As. In turn, C gets inlined into all of As.
In other words, shouldBeDeferred ignores the cost of inlining C into
each of As.
This patch adds an option, inline-deferral-scale, to replace the
expression above with:
TotalCost < Allowance
where
- TotalCost is TotalSecondaryCost + IC.getCost() * # of As, and
- Allowance is IC.getCost() * Scale
For now, the new option defaults to -1, disabling the new scheme.
Reviewers: davidxl
Subscribers: eraman, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79138
We have the option to stop running commands in batch mode when an error
occurs. When that happens we should exit the driver with a non-zero exit
code.
Differential revision: https://reviews.llvm.org/D78825
Currently, normalize() for posix replaces backslashes to slashes, except
that two backslashes in sequence are kept as-is.
clang calls normalize() to convert \ to / is microsoft compat mode. This
generally works well, but a path like "c:\\foo\\bar.h" with two
backslashes doesn't work due to the exception in normalize().
These paths happen naturally on Windows hosts with e.g.
`#include __FILE__`, and them not working on other hosts makes it
more difficult to write tests for this case.
The special case has been around without justification since this code
was added in r203611 (since then moved around in r215241 r215243). No
integration tests fail if I remove it.
Try removing the special case.
Differential Revision: https://reviews.llvm.org/D79265
Summary:
Adds the loop unroll transformation for loop::ForOp.
Adds support for promoting the body of single-iteration loop::ForOps into its containing block.
Adds check tests for loop::ForOps with dynamic and static lower/upper bounds and step.
Care was taken to share code (where possible) with the AffineForOp unroll transformation to ease maintenance and potential future transition to a LoopLike construct on which loop transformations for different loop types can implemented.
Reviewers: ftynse, nicolasvasilache
Reviewed By: ftynse
Subscribers: bondhugula, mgorny, zzheng, mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, Joonsoo, grosul1, frgossen, Kayjukh, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79184
The AMDGPU target has a convention that defined all VGPRs
(execept the initial 32 argument registers) as callee-saved.
This convention is not efficient always, esp. when the callee
requiring more registers, ended up emitting a large number of
spills, even though its caller requires only a few.
This patch revises the ABI by introducing more scratch registers
that a callee can freely use.
The 256 vgpr registers now become:
32 argument registers
112 scratch registers and
112 callee saved registers.
The scratch registers and the CSRs are intermixed at regular
intervals (a split boundary of 8) to obtain a better occupancy.
Reviewers: arsenm, t-tye, rampitec, b-sumner, mjbedy, tpr
Reviewed By: arsenm, t-tye
Differential Revision: https://reviews.llvm.org/D76356
A new test matcher class MPFRMatcher is added along with helper macros
EXPECT|ASSERT_MPFR_MATCH.
New type traits classes RemoveCV and IsFloatingPointType have been
added and used to implement the above class and its helpers.
Reviewers: abrachet, phosek
Differential Revision: https://reviews.llvm.org/D79256
This patch implements the final bits of CMSE code generation:
* emit special linker symbols
* restrict parameter passing to not use memory
* emit BXNS and BLXNS instructions for returns from non-secure entry
functions, and non-secure function calls, respectively
* emit code to save/restore secure floating-point state around calls
to non-secure functions
* emit code to save/restore non-secure floating-pointy state upon
entry to non-secure entry function, and return to non-secure state
* emit code to clobber registers not used for arguments and returns
when switching to no-secure state
Patch by Momchil Velikov, Bradley Smith, Javed Absar, David Green,
possibly others.
Differential Revision: https://reviews.llvm.org/D76518
This builds on the or-reduction bailout that was added with D67841.
We still do not have IR-level load combining, although that could
be a target-specific enhancement for -vector-combiner.
The heuristic is narrowly defined to catch the motivating case from
PR39538:
https://bugs.llvm.org/show_bug.cgi?id=39538
...while preserving existing functionality.
That is, there's an unmodified test of pure load/zext/store that is
not seen in this patch at llvm/test/Transforms/SLPVectorizer/X86/cast.ll.
That's the reason for the logic difference to require the 'or'
instructions. The chances that vectorization would actually help a
memory-bound sequence like that seem small, but it looks nicer with:
vpmovzxwd (%rsi), %xmm0
vmovdqu %xmm0, (%rdi)
rather than:
movzwl (%rsi), %eax
movl %eax, (%rdi)
...
In the motivating test, we avoid creating a vector mess that is
unrecoverable in the backend, and SDAG forms the expected bswap
instructions after load combining:
movzbl (%rdi), %eax
vmovd %eax, %xmm0
movzbl 1(%rdi), %eax
vmovd %eax, %xmm1
movzbl 2(%rdi), %eax
vpinsrb $4, 4(%rdi), %xmm0, %xmm0
vpinsrb $8, 8(%rdi), %xmm0, %xmm0
vpinsrb $12, 12(%rdi), %xmm0, %xmm0
vmovd %eax, %xmm2
movzbl 3(%rdi), %eax
vpinsrb $1, 5(%rdi), %xmm1, %xmm1
vpinsrb $2, 9(%rdi), %xmm1, %xmm1
vpinsrb $3, 13(%rdi), %xmm1, %xmm1
vpslld $24, %xmm0, %xmm0
vpmovzxbd %xmm1, %xmm1 # xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero,xmm1[2],zero,zero,zero,xmm1[3],zero,zero,zero
vpslld $16, %xmm1, %xmm1
vpor %xmm0, %xmm1, %xmm0
vpinsrb $1, 6(%rdi), %xmm2, %xmm1
vmovd %eax, %xmm2
vpinsrb $2, 10(%rdi), %xmm1, %xmm1
vpinsrb $3, 14(%rdi), %xmm1, %xmm1
vpinsrb $1, 7(%rdi), %xmm2, %xmm2
vpinsrb $2, 11(%rdi), %xmm2, %xmm2
vpmovzxbd %xmm1, %xmm1 # xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero,xmm1[2],zero,zero,zero,xmm1[3],zero,zero,zero
vpinsrb $3, 15(%rdi), %xmm2, %xmm2
vpslld $8, %xmm1, %xmm1
vpmovzxbd %xmm2, %xmm2 # xmm2 = xmm2[0],zero,zero,zero,xmm2[1],zero,zero,zero,xmm2[2],zero,zero,zero,xmm2[3],zero,zero,zero
vpor %xmm2, %xmm1, %xmm1
vpor %xmm1, %xmm0, %xmm0
vmovdqu %xmm0, (%rsi)
movl (%rdi), %eax
movl 4(%rdi), %ecx
movl 8(%rdi), %edx
movbel %eax, (%rsi)
movbel %ecx, 4(%rsi)
movl 12(%rdi), %ecx
movbel %edx, 8(%rsi)
movbel %ecx, 12(%rsi)
Differential Revision: https://reviews.llvm.org/D78997
Summary:
Most of these checks were already implemented, and I just added references to
them to the code and tests. Also, much of this code was already
reviewed in the old flang/f18 GitHub repository, but I didn't get to
merge it before we switched repositories.
I implemented the check for C747 to not allow coarray components in derived
types that are of type C_PTR, C_FUNPTR, or type TEAM_TYPE.
I implemented the check for C748 that requires a data component whose type has
a coarray ultimate component to be a nonpointer, nonallocatable scalar and not
be a coarray.
I implemented the check for C750 that adds additional restrictions to the
bounds expressions of a derived type component that's an array.
These bounds expressions are sepcification expressions as defined in
10.1.11. There was already code in lib/Evaluate/check-expression.cpp to
check semantics for specification expressions, but it did not check for
the extra requirements of C750.
C750 prohibits specification functions, the intrinsic functions
ALLOCATED, ASSOCIATED, EXTENDS_TYPE_OF, PRESENT, and SAME_TYPE_AS. It
also requires every specification inquiry reference to be a constant
expression, and requires that the value of the bound not depend on the
value of a variable.
To implement these additional checks, I added code to the intrinsic proc
table to get the intrinsic class of a procedure. I also added an
enumeration to distinguish between specification expressions for
derived type component bounds versus for type parameters. I then
changed the code to pass an enumeration value to
"CheckSpecificationExpr()" to indicate that the expression was a bounds
expression and used this value to determine whether to emit an error
message when violations of C750 are found.
I changed the implementation of IsPureProcedure() to handle statement
functions and changed some references in the code that tested for the
PURE attribute to call IsPureProcedure().
I also fixed some unrelated tests that got new errors when I implemented these
new checks.
Reviewers: tskeith, DavidTruby, sscalpone
Subscribers: jfb, llvm-commits
Tags: #llvm, #flang
Differential Revision: https://reviews.llvm.org/D79263
Summary:
This change fixes an aarch64-specific bug in the generation of the NDS and WDS values used to compute the signature of the vector functions out of OpenMP directives like `declare simd`. When the directive is used in conjunction with the `linear` clause, the size of the pointee must be used instead of the size of the pointer to compute NDS and WDS.
The code-fix is strictly related to the behavior for `linear`, but given that the only way we have to test the NDS and WDS values is to check the resulting `<vlen>` token in the mangled name of the vector function, the tests have been extended to cover all the possible values of WDS and NDS as defined in the ABI at https://github.com/ARM-software/abi-aa/tree/master/vfabia64.
Reviewers: ABataev, jdoerfert, andwar
Reviewed By: jdoerfert
Subscribers: yaxunl, kristof.beyls, guansong, danielkiss, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78969
This patch adds ORE for MachinePipeliner, so that people can anaylyze
their code using opt-viewer or other tools, then optimize the code to
catch more piplining opportunities.
Reviewed By: bcahoon
Differential Revision: https://reviews.llvm.org/D79368
getScalarizationOverhead is only ever called with vectors (and we already had a load of cast<VectorType> calls immediately inside the functions).
Followup to D78357
Reviewed By: @samparker
Differential Revision: https://reviews.llvm.org/D79341
Summary:
The RISC-V debug register was named dscratch in a previous draft of the RISC-V
debug mode spec. The number of registers has been increased to 2 in the latest
ratified version of the debug mode spec and the registers were named dscratch0
and dscratch1. We still support using the old register name "dscratch", but it
would be disassembled as "dscratch0" with this change.
Reviewers: apazos, asb, lenary, luismarques
Reviewed By: asb
Subscribers: hiraditya, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, sameer.abuasal, evandro, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78764
Summary:
This was added in https://reviews.llvm.org/D2449, but I'm not sure it's
necessary since an inalloca value is never a Constant (should be an
AllocaInst).
Reviewers: hans, rnk
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79350
We check that C is finite and strictly positive, but there's no need to
check that it's normal too. exp2 should be just as accurate on denormals
as pow is.
Differential Revision: https://reviews.llvm.org/D79413
Size==0 triggers `assert(Size != 0)` in mapped_file_region::init.
I plan to use an empty file in D79339 (llvm-objcopy --dump-section).
According to POSIX, "If len is zero, mmap() shall fail and no mapping
shall be established." Just specialize case Size=0 to use
createInMemoryBuffer.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D79338
LSR has some logic that tries to aggressively reuse registers in
formula. This can lead to sub-optimal decision in complex loops where
the backend it trying to use shouldFavorPostInc. This disables the
re-use in those situations.
Differential Revision: https://reviews.llvm.org/D79301
VMEM soft clauses only contain VMEM and FLAT instructions. Teaching
GCNHazardRecognizer::checkSoftClauseHazards that other kinds of
instructions will naturally break the clause means there are far fewer
cases where it has to insert an s_nop instruction to forcibly break the
clause.
Differential Revision: https://reviews.llvm.org/D79353
Summary:
The new option allows the user to specify which file naming convention is used
by the source code ('llvm' or 'google').
Reviewers: gribozavr2
Subscribers: xazax.hun, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79380
This commit removes minor features of the test suite that I've never
seen used and that are basically just a maintenance burden:
- color_diagnostics: Diagnostics are colored by default when running
from a terminal, and not colored otherwise. This is the right behavior.
Being able to tweak this has minor value, and could be achieved by
modifying the %{compile_flags} instead if absolutely needed.
- ccache: This can be achieved by using a wrapper for the %{cxx}
substitution.
- _dump_macros_verbose is just a dead function now.
I don't think there's any good reason not to do this transformation when
the pow has multiple uses.
Differential Revision: https://reviews.llvm.org/D79407
I'm not sure what the conventions are for this documentation. The
format seems limiting. I don't see how to refer to other flags, or
mark flags as deprecated. The rst I believe these generate seems to be
in source, and out of date.