As reported in Bug 42535, `clang` doesn't inline atomic ops on 32-bit
Sparc, unlike `gcc` on Solaris. In a 1-stage build with `gcc`, only two
testcases are affected (currently `XFAIL`ed), while in a 2-stage build more
than 100 tests `FAIL` due to this issue.
The reason for this `gcc`/`clang` difference is that `gcc` on 32-bit
Solaris/SPARC defaults to `-mcpu=v9` where atomic ops are supported, unlike
with `clang`'s default of `-mcpu=v8`. This patch changes `clang` to use
`-mcpu=v9` on 32-bit Solaris/SPARC, too.
Doing so uncovered two bugs:
`clang -m32 -mcpu=v9` chokes with any Solaris system headers included:
/usr/include/sys/isa_defs.h:461:2: error: "Both _ILP32 and _LP64 are defined"
#error "Both _ILP32 and _LP64 are defined"
While `clang` currently defines `__sparcv9` in a 32-bit `-mcpu=v9`
compilation, neither `gcc` nor Studio `cc` does. In fact, the Studio 12.6
`cc(1)` man page clearly states:
These predefinitions are valid in all modes:
[...]
__sparcv8 (SPARC)
__sparcv9 (SPARC -m64)
At the same time, the patch defines `__GCC_HAVE_SYNC_COMPARE_AND_SWAP_[1248]`
for a 32-bit Sparc compilation with any V9 cpu. I've also changed
`MaxAtomicInlineWidth` for V9, matching what `gcc` does and the Oracle
Developer Studio 12.6: C User's Guide documents (Ch. 3, Support for Atomic
Types, 3.1 Size and Alignment of Atomic C Types).
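For illustration only (not part of the patch), the kind of code affected is
a 64-bit compare-and-swap through `std::atomic`. In this sketch the helper
names `casAdd` and `hasInlineCas8` are made up; whether the loop is inlined
or turned into a library call depends on the CPU the compiler targets.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper, for illustration only: adds `delta` to `counter`
// with an explicit compare-and-swap loop and returns the new value.
std::uint64_t casAdd(std::atomic<std::uint64_t> &counter,
                     std::uint64_t delta) {
  std::uint64_t expected = counter.load(std::memory_order_relaxed);
  // On failure compare_exchange_weak reloads `expected`, so just retry.
  while (!counter.compare_exchange_weak(expected, expected + delta,
                                        std::memory_order_acq_rel,
                                        std::memory_order_relaxed)) {
  }
  return expected + delta;
}

// True when the compiler advertises an inline 8-byte compare-and-swap
// through the predefined macro discussed above.
constexpr bool hasInlineCas8() {
#ifdef __GCC_HAVE_SYNC_COMPARE_AND_SWAP_8
  return true;
#else
  return false;
#endif
}
```

Code like this can test `hasInlineCas8()` to see which configuration it was
built for.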
The two testcases that had been `XFAIL`ed for Bug 42535 are un-`XFAIL`ed
again.
Tested on `sparcv9-sun-solaris2.11` and `amd64-pc-solaris2.11`.
Differential Revision: https://reviews.llvm.org/D86621
This patch updates the documentation about `__builtin_memcpy_inline` and reorders the sections so it is more consistent and understandable.
Differential Revision: https://reviews.llvm.org/D87458
The tests have been updated and I plan to move them from the MSSA
directory up.
Some end-to-end tests needed small adjustments. One difference to the
legacy DSE is that legacy DSE also deletes trivially dead instructions
that are unrelated to memory operations. Because MemorySSA-backed DSE
just walks the MemorySSA, we only visit/check memory instructions. But
removing unrelated dead instructions is not really DSE's job and other
passes will clean up.
One noteworthy change is in llvm/test/Transforms/Coroutines/ArgAddr.ll,
but I think this comes down to legacy DSE not handling instructions that
may throw correctly in that case. To cover this with MemorySSA-backed
DSE, we need an update to llvm.coro.begin to treat its return value as
belonging to the same underlying object as the passed pointer.
There are some minor cases MemorySSA-backed DSE currently misses, e.g. related
to atomic operations, but I think those can be implemented after the switch.
This has been discussed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2020-August/144417.html
For MultiSource/SPEC2000/SPEC2006, the number of eliminated stores
goes from ~17500 (legacy DSE) to ~26300 (MemorySSA-backed). More numbers
and details are in the thread on llvm-dev.
Impact on CTMark:
```
Legacy Pass Manager
                          exec instrs    size-text
O3                        + 0.60%        - 0.27%
ReleaseThinLTO            + 1.00%        - 0.42%
ReleaseLTO-g.             + 0.77%        - 0.33%
RelThinLTO (link only)    + 0.87%        - 0.42%
RelLO-g (link only)       + 0.78%        - 0.33%
```
http://llvm-compile-time-tracker.com/compare.php?from=3f22e96d95c71ded906c67067d75278efb0a2525&to=ae8be4642533ff03803967ee9d7017c0d73b0ee0&stat=instructions
```
New Pass Manager
                          exec instrs    size-text
O3                        + 0.95%        - 0.25%
ReleaseThinLTO            + 1.34%        - 0.41%
ReleaseLTO-g.             + 1.71%        - 0.35%
RelThinLTO (link only)    + 0.96%        - 0.41%
RelLO-g (link only)       + 2.21%        - 0.35%
```
http://195.201.131.214:8000/compare.php?from=3f22e96d95c71ded906c67067d75278efb0a2525&to=ae8be4642533ff03803967ee9d7017c0d73b0ee0&stat=instructions
Reviewed By: asbirlea, xbolva00, nikic
Differential Revision: https://reviews.llvm.org/D87163
Currently AMDGPU does not support sanitizers. Disable
sanitizer options for now until they are supported.
Differential Revision: https://reviews.llvm.org/D87461
Instead of using CLANG_ENABLE_STATIC_ANALYZER for use of the
static analyzer in both clang and clang-tidy, add a second
toggle CLANG_TIDY_ENABLE_STATIC_ANALYZER.
This allows enabling the static analyzer in clang-tidy while
disabling it in clang.
Differential Revision: https://reviews.llvm.org/D87118
There are still plenty of tests that specify x86 as a triple but most shouldn't be doing anything very target specific - we can move any ones that I have missed on a case by case basis.
There are 2 reasons to remove strcasecmp and strncasecmp.
1) They are also modeled in CStringChecker and the related argument
constraints are checked there.
2) The argument constraints are checked in CStringChecker::evalCall.
This is fundamentally flawed, they should be checked in checkPreCall.
Even if we set up CStringChecker as a weak dependency for
StdLibraryFunctionsChecker then the latter reports the warning always.
Besides, CStringChecker fails to discover the constraint violation
before the call, so, its evalCall returns with `true` and then
StdCLibraryFunctions also tries to evaluate, this causes an assertion
in CheckerManager.
Either we fix CStringChecker to handle the call prerequisites in
checkPreCall, or we must not evaluate any pure functions in
StdCLibraryFunctions that are also handled in CStringChecker.
We do the latter in this patch.
Differential Revision: https://reviews.llvm.org/D87239
Basic block sections are untested on platforms and binary formats other
than x86 and ELF. This patch emits a warning and drops the flag if the platform
and binary format are not compatible. Add a test to ensure that
specifying an incompatible target in the driver does not enable the
feature.
Differential Revision: https://reviews.llvm.org/D87426
This is the initial part of the implementation of the C++20 likelihood
attributes. It handles the attributes in an if statement.
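As a small illustration of the syntax this covers (the function name
`parseDigit` is hypothetical and not taken from the patch), the attribute is
placed on the substatement of the `if`:

```cpp
// Hypothetical example: the error path is marked as the unlikely branch.
// Compilers that do not know the attribute ignore it (possibly with a
// warning), so the behaviour of the function is the same either way.
int parseDigit(char c) {
  if (c < '0' || c > '9') [[unlikely]] {
    return -1; // rarely taken: not a decimal digit
  }
  return c - '0';
}
```

The attribute is only an optimization hint; it does not change which branch
is taken.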
Differential Revision: https://reviews.llvm.org/D85091
In the standard C library, both rint and nearbyint return the rounded result
in the current rounding mode, but nearbyint never raises the inexact exception.
On PowerPC, x(v|s)r(d|s)pic may modify FPSCR XX, raising the inexact
exception, so we can't select constrained fnearbyint into xvrdpic.
One exception here is xsrqpi, which will not raise inexact exception, so
fnearbyint f128 is okay here.
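The library-level difference can be observed with a short, host-side sketch
(not PowerPC specific; the helper names are made up for illustration). It
assumes a libm that follows IEC 60559 (C Annex F), where rint raises inexact
when the result differs from the argument.

```cpp
#include <cfenv>
#include <cmath>

// Returns true if calling `op` on `x` raises the inexact exception.
// The volatile accesses keep the compiler from folding the call away or
// moving it across the calls that clear and test the exception flags.
template <typename Op> bool raisesInexact(Op op, double x) {
  volatile double in = x;
  std::feclearexcept(FE_ALL_EXCEPT);
  volatile double out = op(in);
  (void)out;
  return std::fetestexcept(FE_INEXACT) != 0;
}

// Hypothetical wrappers around the two library functions being compared.
bool rintRaisesInexact(double x) {
  return raisesInexact([](double v) { return std::rint(v); }, x);
}

bool nearbyintRaisesInexact(double x) {
  return raisesInexact([](double v) { return std::nearbyint(v); }, x);
}
```

For a non-integral input such as 2.5, `rintRaisesInexact` is expected to
report the flag under that assumption, while `nearbyintRaisesInexact` must
not. That is the property that rules out an instruction which sets XX.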
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D87220
Extract a simple check for whether a `RecordDecl` is a `CFError` Decl.
This is a simple refactoring to prepare for an upcoming change. NFC.
Patch is extracted from
8afaf3aad2.
This patch resumes the work of D16586.
According to the AAPCS, volatile bit-fields should
be accessed using containers of the width of their
declarative type. In such a case:
```
struct S1 {
  short a : 1;
};
```
should be accessed using loads and stores of the width
(sizeof(short)), whereas the compiler currently loads only
the minimum required width (char in this case).
However, as discussed in D16586,
that could overwrite non-volatile bit-fields, which
conflicted with C and C++ object models by creating
data race conditions that are not part of the bit-field,
e.g.
```
struct S2 {
  short a;
  int b : 16;
};
```
Accessing `S2.b` would also access `S2.a`.
The AAPCS Release 2020Q2
(https://documentation-service.arm.com/static/5efb7fbedbdee951c1ccf186?token=)
section 8.1 Data Types, page 36, "Volatile bit-fields -
preserving number and width of container accesses" has been
updated to avoid conflict with the C++ Memory Model.
Now it reads in the note:
```
This ABI does not place any restrictions on the access widths of bit-fields where the container
overlaps with a non-bit-field member or where the container overlaps with any zero length bit-field
placed between two other bit-fields. This is because the C/C++ memory model defines these as being
separate memory locations, which can be accessed by two threads simultaneously. For this reason,
compilers must be permitted to use a narrower memory access width (including splitting the access into
multiple instructions) to avoid writing to a different memory location. For example, in
struct S { int a:24; char b; }; a write to a must not also write to the location occupied by b, this requires at least two
memory accesses in all current Arm architectures. In the same way, in struct S { int a:24; int:0; int b:8; };,
writes to a or b must not overwrite each other.
```
Patch D16586 was updated to follow such behavior by verifying that we
only change volatile bit-field access when:
- it won't overlap with any other non-bit-field member
- we only access memory inside the bounds of the record
- it won't overlap with any zero-length bit-field.
Preserving the number of memory accesses will
be implemented by D67399.
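The first two conditions can be sketched as a plain byte-range check. This
is not the code from the patch: `ByteRange` and `canUseContainerAccess` are
hypothetical names, offsets are in bytes, and the zero-length bit-field
condition is left out.

```cpp
#include <cstddef>
#include <vector>

// Byte range [begin, end) occupied by a non-bit-field member of a record.
struct ByteRange {
  std::size_t begin;
  std::size_t end;
};

// Hypothetical check: a container-wide access is acceptable only if the
// container stays inside the record and touches no non-bit-field member.
bool canUseContainerAccess(std::size_t containerOffset,
                           std::size_t containerSize, std::size_t recordSize,
                           const std::vector<ByteRange> &otherMembers) {
  const std::size_t containerEnd = containerOffset + containerSize;
  if (containerEnd > recordSize)
    return false; // would read or write past the end of the record
  for (const ByteRange &m : otherMembers)
    if (m.begin < containerEnd && containerOffset < m.end)
      return false; // container overlaps a non-bit-field member
  return true;
}
```

With the layouts above, `S1` (a 2-byte container in a 2-byte record with no
other members) passes, while `S2` fails because a 4-byte `int` container at
offset 0 would overlap `a` in bytes 0 to 1.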
Differential Revision: https://reviews.llvm.org/D72932
The following people contributed to this patch:
- Diogo Sampaio
- Ties Stuij
In some situations a shift can be treated as a template and is thus formatted as one. Adding a couple of extra checks to ensure that the condition doesn't contain a template, and is in fact a bit shift, solves this problem.
This is a fix for [[ https://bugs.llvm.org/show_bug.cgi?id=46969 | bug 46969 ]]
Reviewed By: MyDeveloperDay
Patch By: Saldivarcher
Differential Revision: https://reviews.llvm.org/D86581
Change capitalization of some names due to LLVM naming rules.
Change names of some variables to make them more descriptive.
Rework similar bug reports into one common function.
Prepare code for the next patches to reduce unrelated changes.
Differential Revision: https://reviews.llvm.org/D87138
LLVM has bumped the minimum required CMake version to 3.13.4, so this has become dead code.
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D87189
Discussed with @craig.topper and @spatel - this is to try and tidy up the codegen folder and move the x86 specific tests (as opposed to general tests that just happen to use x86 triples) into subfolders. It's up to other targets if they follow suit.
It also helps speed up test iterations as using wildcards on lit commands often misses some filenames.
Fixes issue noticed by static analysis where we have a copy+paste typo, testing ScheduleKind.M1 twice instead of ScheduleKind.M2.
Differential Revision: https://reviews.llvm.org/D87250
* Do not visit `CXXDefaultArgExpr`
* To build `CallArguments` nodes, just go through non-default arguments
Differential Revision: https://reviews.llvm.org/D87249
MSVC's cl.exe has a few command line arguments which start with -M such
as "-MD", "-MDd", "-MT", "-MTd", "-MP".
These arguments are not dependency file generation related, and these
arguments were being removed by getClangStripDependencyFileAdjuster()
which was wrong.
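A minimal sketch of the intended behaviour, not the actual adjuster:
`stripDependencyArgs` and its `clMode` parameter are hypothetical, and the
sketch assumes the caller already knows whether the command line is in
cl.exe-compatible mode.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch: drop GCC-style dependency-file options, but leave
// every -M* option alone when the command line is for cl.exe, where -MD,
// -MDd, -MT, -MTd and -MP have nothing to do with dependency files.
std::vector<std::string>
stripDependencyArgs(const std::vector<std::string> &args, bool clMode) {
  std::vector<std::string> out;
  for (std::size_t i = 0; i < args.size(); ++i) {
    const std::string &arg = args[i];
    const bool startsWithDashM = arg.rfind("-M", 0) == 0;
    if (clMode || !startsWithDashM) {
      out.push_back(arg);
      continue;
    }
    // GCC-style -MF, -MT and -MQ take a separate value; skip it as well.
    if (arg == "-MF" || arg == "-MT" || arg == "-MQ")
      ++i;
  }
  return out;
}
```

For `clang -MD -MF a.d -c a.cpp` this leaves `clang -c a.cpp`, while
`cl.exe -MD -c a.cpp` is returned unchanged when `clMode` is true.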
Differential revision: https://reviews.llvm.org/D86999
We want the generic StdLibraryFunctionsChecker to report only if there
are no specific checkers that would handle the argument constraint for a
function.
Note, the assumptions are still evaluated, even if the argument
constraint checker is set to not report. This means that the assumptions
made in the generic StdLibraryFunctionsChecker should be an
over-approximation of the assumptions made in the specific checkers. But
most importantly, the assumptions should not contradict.
Differential Revision: https://reviews.llvm.org/D87240
We're now getting close to having the necessary analysis/combines etc. for the new generic llvm.abs.* intrinsics.
This patch updates the SSE/AVX ABS vector intrinsics to emit the generic equivalents instead of the icmp+sub+select code pattern.
Differential Revision: https://reviews.llvm.org/D87101
Decl::dump is primarily used for debugging to visualise the current state of a
declaration. Usually Decl::dump just displays the current state of the Decl and
doesn't actually change any of its state, however since commit
457226e02a the method actually started loading
additional declarations from the ExternalASTSource. As a result, calling
Decl::dump during a debugging session now makes permanent changes to the
AST and causes the debugged program run to deviate from the original run.
The change that caused this behaviour is the addition of
`hasConstexprDestructor` (which is called from the TextNodeDumper) which
performs a lookup into the current CXXRecordDecl to find the destructor. All
other similar methods just return their respective bit in the DefinitionData
(which obviously doesn't have such side effects).
This just changes the node printer to emit "unknown_constexpr" in case a
CXXRecordDecl is dumped that could potentially call into the ExternalASTSource
instead of the usually empty string/"constexpr". For CXXRecordDecls that can
safely be dumped the old behaviour is preserved.
Reviewed By: bruno
Differential Revision: https://reviews.llvm.org/D80878
This change groups the following:
* Rename: `ignoreParenBaseCasts` -> `IgnoreParenBaseCasts` for uniformity
* Rename: `IgnoreConversionOperator` -> `IgnoreConversionOperatorSingleStep` for uniformity
* Inline `IgnoreNoopCastsSingleStep` into a lambda inside `IgnoreNoopCasts`
* Refactor `IgnoreUnlessSpelledInSource` to make adequate use of `IgnoreExprNodes`
Differential Revision: https://reviews.llvm.org/D86880
Rationale:
This allows users to use `IgnoreExprNodes` and `Ignore*SingleStep` outside of
`clang/AST/Expr.cpp`.
Minor:
Rename `IgnoreImp...SingleStep` into `IgnoreImplicit...SingleStep`.
Differential Revision: https://reviews.llvm.org/D86778
When using the always break after return type setting:
Before:
SomeType funcdecl(LIST(uint64_t));
After:
SomeType
funcdecl(LIST(uint64_t));
Reviewed By: MyDeveloperDay
Differential Revision: https://reviews.llvm.org/D87007
Before: _Atomic(uint64_t) * a;
After: _Atomic(uint64_t) *a;
This treats _Atomic the same as the TypenameMacros and decltype. It
also allows some cleanup by removing checks whether the token before a
paren is kw_decltype and instead checking for TT_TypeDeclarationParen.
While touching this code also extend the decltype test cases to also check
for typeof() and _Atomic(T).
Reviewed By: MyDeveloperDay
Differential Revision: https://reviews.llvm.org/D86959
This adds an `AttributeMacros` configuration option that causes certain
identifiers to be parsed like a __attribute__((foo)) annotation.
This is motivated by our CHERI C/C++ fork which adds a __capability
qualifier for pointer/reference. Without this change clang-format parses
many type declarations as multiplications/bitwise-and instead.
I initially considered adding "__capability" as a new clang-format keyword,
but having a list of macros that should be treated as attributes is more
flexible since it can be used e.g. for static analyzer annotations or other language
extensions.
Example: std::vector<foo * __capability> -> std::vector<foo *__capability>
Depends on D86775 (to apply cleanly)
Reviewed By: MyDeveloperDay, jrtc27
Differential Revision: https://reviews.llvm.org/D86782
It seems '${cmake_2_8_12_PRIVATE}' was removed a long time ago, so it should
just be the PRIVATE keyword here.
Reviewed By: john.brawn
Differential Revision: https://reviews.llvm.org/D86091
send_patched_file decodes with utf-8.
The default encoder for python 2 is ascii.
So it is necessary to also change send_string to use utf-8.
Differential Revision: https://reviews.llvm.org/D83984
This patch implements the vec_expandm function prototypes in altivec.h in order
to utilize the vector expand with mask instructions introduced in Power10.
Differential Revision: https://reviews.llvm.org/D82727
They are far more powerful than the current documentation implies; this
adds
* adopting a lock,
* deferring a lock,
* manually unlocking the scoped capability,
* relocking the scoped capability, possibly in a different mode,
* try-relocking the scoped capability.
Also there is now a generic explanation of how attributes on scoped
capabilities work. There has been confusion in the past about how to
annotate them (see e.g. PR33504), hopefully this clears things up.
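A minimal sketch of a scoped capability with manual unlock and relock,
loosely following the style of the Thread Safety Analysis documentation. The
`TSA` macro and the names `Mutex`, `MutexLocker`, `counterMu` and `bumpTwice`
are assumptions made for this example; the attributes expand to nothing on
compilers other than clang.

```cpp
#include <mutex>

#if defined(__clang__)
#define TSA(x) __attribute__((x))
#else
#define TSA(x)
#endif

class TSA(capability("mutex")) Mutex {
public:
  void Lock() TSA(acquire_capability()) { M.lock(); }
  void Unlock() TSA(release_capability()) { M.unlock(); }

private:
  std::mutex M;
};

// Scoped capability: acquires in the constructor, releases in the
// destructor, and can be unlocked and relocked manually in between.
class TSA(scoped_lockable) MutexLocker {
public:
  explicit MutexLocker(Mutex *Mu) TSA(acquire_capability(Mu))
      : Mut(Mu), Locked(true) {
    Mu->Lock();
  }
  void Unlock() TSA(release_capability()) {
    Mut->Unlock();
    Locked = false;
  }
  void Lock() TSA(acquire_capability()) {
    Mut->Lock();
    Locked = true;
  }
  ~MutexLocker() TSA(release_capability()) {
    if (Locked)
      Mut->Unlock();
  }

private:
  Mutex *Mut;
  bool Locked;
};

Mutex counterMu;
int counter TSA(guarded_by(counterMu)) = 0;

int bumpTwice() {
  MutexLocker guard(&counterMu);
  ++counter;      // guarded access, lock held
  guard.Unlock(); // manual unlock of the scoped capability
  // ... work that does not touch `counter` ...
  guard.Lock();   // relock before touching `counter` again
  ++counter;
  return counter;
}
```

The attributes on the members of `MutexLocker` refer to the underlying mutex
that was passed to the constructor, which is why `release_capability()` and
`acquire_capability()` take no argument there.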
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D87066
The old locking attributes had a generic release, but as it turns out
the capability-based attributes have it as well.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D87064
Recent versions of python3's multiprocessing module will blow up with
a Runtime error from this code, saying:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase
This is because the wrappers in bin/ are not using the `__name__ == "__main__"` idiom correctly.
Reviewed By: ldionne
Differential Revision: https://reviews.llvm.org/D87051
After adding a field of one bit, the bitfield members would take
30+1+1+1 = 33 bits, causing the size of TemplateParameterList to
increase from 16 to 24 bytes on 64-bit systems.
With 29 bits for NumParams we can encode up to half a billion template
parameters, which is almost certainly still enough for anybody.
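The arithmetic can be reproduced with a stand-alone sketch. The two structs
below are stand-ins and not the real class: three `unsigned` members are
assumed in place of the other 32-bit fields, and `alignas(8)` models the
pointer alignment. The sizes in the comments assume a typical 64-bit ABI.

```cpp
#include <cstddef>

// 30 + 1 + 1 + 1 = 33 bits: the last flag spills into a new 32-bit unit.
struct alignas(8) ParamListWith30Bits {
  unsigned LocA, LocB, LocC; // stand-ins for other 32-bit members
  unsigned NumParams : 30;
  unsigned FlagA : 1;
  unsigned FlagB : 1;
  unsigned FlagC : 1; // the newly added bit
};

// 29 + 1 + 1 + 1 = 32 bits: everything fits into one 32-bit unit again.
struct alignas(8) ParamListWith29Bits {
  unsigned LocA, LocB, LocC;
  unsigned NumParams : 29;
  unsigned FlagA : 1;
  unsigned FlagB : 1;
  unsigned FlagC : 1;
};

// Largest value that fits into the 29-bit NumParams field: 2^29 - 1.
constexpr unsigned maxParams() { return (1u << 29) - 1; }

constexpr std::size_t sizeWith30Bits() { return sizeof(ParamListWith30Bits); }
constexpr std::size_t sizeWith29Bits() { return sizeof(ParamListWith29Bits); }
```

Under those assumptions `sizeWith30Bits()` is 24 and `sizeWith29Bits()` is
16, and `maxParams()` is 536870911, i.e. roughly half a billion.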
Instead of just mutex members we also consider mutex globals.
Unsurprisingly they are always in scope. Now the paper [1] says that
> The scope of a class member is assumed to be its enclosing class,
> while the scope of a global variable is the translation unit in
> which it is defined.
But I don't think we should limit this to TUs where a definition is
available - a declaration is enough to acquire the mutex, and if a mutex
is really limited in scope to a translation unit, it should probably be
only declared there.
[1] https://static.googleusercontent.com/media/research.google.com/en/us/pubs/archive/42958.pdf
Fixes PR46354.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D84604
When parsing a C++17 binding declaration, we first create the
BindingDecls in Sema::ActOnDecompositionDeclarator, and then build the
DecompositionDecl in Sema::ActOnVariableDeclarator, so the contained
BindingDecls are never null. But when deserializing, we read the
DecompositionDecl with all properties before filling in the Bindings.
Among other things, reading a declaration reads whether it's invalid and
then calls setInvalidDecl, which assumes that all bindings of the
DecompositionDecl are available, but that isn't the case.
Deserialization should just set all properties directly without invoking
subsequent functions, so we just set the flag without using the setter.
Fixes PR34960.
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D86207
I don't think this is obvious, since try-acquire seemingly contradicts
our usual requirements of "no conditional locking".
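A short sketch of the pattern in question. The `TSA` macro and the names
`Mutex`, `valueMu` and `tryIncrement` are made up for this example; the
attributes expand to nothing outside clang.

```cpp
#include <mutex>

#if defined(__clang__)
#define TSA(x) __attribute__((x))
#else
#define TSA(x)
#endif

class TSA(capability("mutex")) Mutex {
public:
  // The capability is held only if the call returns true.
  bool TryLock() TSA(try_acquire_capability(true)) { return M.try_lock(); }
  void Unlock() TSA(release_capability()) { M.unlock(); }

private:
  std::mutex M;
};

Mutex valueMu;
int value TSA(guarded_by(valueMu)) = 0;

// Returns false without touching `value` when the lock is not available.
bool tryIncrement() {
  if (!valueMu.TryLock())
    return false; // lock not held on this path
  ++value;        // lock held on this path
  valueMu.Unlock();
  return true;
}
```

The locking is conditional at run time, but on each path through the branch
the set of held capabilities is fixed, so the analysis can still track it.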
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D87065
These overloads are listed in appendix A of the ELFv2 ABI specification
without a requirement for ISA 3.0. So these need to be available on
all Altivec-capable architectures. The implementation in altivec.h
erroneously had them guarded for Power9 due to the availability of
the VCMPNE[BHW] instructions. However these need to be implemented
in terms of the VCMPEQ[BHW] instructions on older architectures.
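The fallback idea is simply to complement the equality mask. A portable
scalar sketch follows; it is not the altivec.h code, and `Vec4`, `cmpeq` and
`cmpne` are hypothetical names modelling four 32-bit lanes.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Vec4 = std::array<std::uint32_t, 4>;

// Lane-wise equality: all-ones where the lanes match, zero otherwise.
Vec4 cmpeq(const Vec4 &a, const Vec4 &b) {
  Vec4 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = (a[i] == b[i]) ? 0xFFFFFFFFu : 0u;
  return r;
}

// Lane-wise inequality built only from the equality compare.
Vec4 cmpne(const Vec4 &a, const Vec4 &b) {
  Vec4 r = cmpeq(a, b);
  for (std::uint32_t &lane : r)
    lane = ~lane; // complement the equality mask
  return r;
}
```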
Fixes: https://bugs.llvm.org/show_bug.cgi?id=47423
With 6a75496836, these two options are no longer
forwarded to GCC. This patch restores the original behavior.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D87162
The load builtins in altivec.h do not have const in the signature
for the pointer parameter. This prevents using them for loading
from constant pointers. A notable case for such a use is Eigen.
This patch simply adds the missing const.
Fixes: https://bugs.llvm.org/show_bug.cgi?id=47408
Currently a test failure always reports a line number inside verifyFormat()
which is not very helpful to see which test failed. With this change we now
emit the line number where the verify function was called. When using an
IDE such as CLion, the output now includes a clickable link that points to
the call site.
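The general technique is to turn the helper into a macro that forwards
`__FILE__` and `__LINE__` from the call site. The sketch below is framework
independent and does not show the actual test code; `CheckResult`,
`verifyEqualImpl` and `verifyEqual` are hypothetical names.

```cpp
#include <string>

struct CheckResult {
  bool Ok;
  std::string Where; // "file:line" of the failing call site, empty if Ok
};

// The implementation receives the caller's location explicitly.
CheckResult verifyEqualImpl(const char *File, int Line,
                            const std::string &Expected,
                            const std::string &Actual) {
  CheckResult R;
  R.Ok = Expected == Actual;
  if (!R.Ok)
    R.Where = std::string(File) + ":" + std::to_string(Line);
  return R;
}

// The macro expands at the call site, so __FILE__ and __LINE__ refer to
// the test that called it and not to the helper's own definition.
#define verifyEqual(Expected, Actual)                                         \
  verifyEqualImpl(__FILE__, __LINE__, (Expected), (Actual))
```

A failing `verifyEqual("a", "b")` then reports the line of the test that
invoked it, in a `file:line` form that many IDEs render as a link.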
Reviewed By: MyDeveloperDay
Differential Revision: https://reviews.llvm.org/D86926
Currently clang-format starts overriding the default values at index 0
(keeping the existing values) instead of appending or replacing all values.
This patch simply checks the current (IMO surprising) behaviour and does
not attempt to change it.
Reviewed By: MyDeveloperDay
Differential Revision: https://reviews.llvm.org/D86941
While parsing LateParsedTemplates, Clang assumes that the Global DeclID matches
with the Local DeclID of a Decl. This is not the case when we have multiple
dependent modules, each having their own LateParsedTemplate section. In such a
case, a Local/Global DeclID confusion occurs which leads to improper casting of
FunctionDecl's.
This commit creates a Vector to map the LateParsedTemplate section of each
Module to its module file, thereby resolving the Global/Local DeclID
confusion.
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D86514
The "restrict" keyword is illegal in C++; however, many libc
implementations use the "__restrict" compiler intrinsic in function
prototypes. The "__restrict" keyword qualifies a type as a restricted type
even in C++.
In case of any non-C99 languages, we don't want to match based on the
restrict qualifier because we cannot know if the given libc implementation
qualifies the parameter type or not.
Differential Revision: https://reviews.llvm.org/D87097
This change implements pragma STDC FENV_ROUND, which is introduced by
the extension to the standard (TS 18661-1). The pragma is implemented only
in the frontend; it sets the appropriate state of FPOptions stored in Sema. Use
of these bits in constant evaluation and/or the code generator is not in the
scope of this change.
The parser issues a warning about an unsupported pragma when it encounters
pragma STDC FENV_ROUND; however, it makes syntax checks and updates Sema state
as if the pragma were supported.
The primary purpose of the partial implementation is to facilitate
development of non-default floating point environment. Previously a
developer could not set a non-default rounding mode in sources, which made
preparing tests for, say, constant evaluation substantially more complicated.
Differential Revision: https://reviews.llvm.org/D86921
This adds the size to forward declared class DITypes, if the size is known.
Fixes an issue where we determine whether to emit fragments based on the
type size, so fragments would sometimes be incorrectly emitted if there
was no size.
Bug: https://bugs.llvm.org/show_bug.cgi?id=47338
Differential Revision: https://reviews.llvm.org/D87062
Previously we had two overloads where the only real difference beyond
parameter order was whether a reference parameter is const, where one
overload treated the reference parameter as an in-parameter and the
other treated it as an out-parameter!
Previously, this code discarded the result of CheckPlaceholderExpr for
non-matrix subexpressions. Not only is this wasteful, but it was creating a
-Warc-repeated-use-of-weak false positive on the attached testcase, since the
discarded expression was still registered as a use of the weak property.
rdar://66162246
Differential revision: https://reviews.llvm.org/D87102
The new overloads apply directly to a node, like the
`clang::ast_matchers::match` functions, rather than generating an
`EditGenerator` combinator.
Differential Revision: https://reviews.llvm.org/D87031
The __ARM_FEATURE_SVE_BITS feature macro is specified in the Arm C
Language Extensions (ACLE) for SVE [1] (version 00bet5). From the spec,
where __ARM_FEATURE_SVE_BITS==N:
When N is nonzero, indicates that the implementation is generating
code for an N-bit SVE target and that the arm_sve_vector_bits(N)
attribute is available.
This was defined in D83550 as __ARM_FEATURE_SVE_BITS_EXPERIMENTAL and
enabled under the -msve-vector-bits flag to simplify initial tests.
This patch drops _EXPERIMENTAL now there is support for the feature.
[1] https://developer.arm.com/documentation/100987/latest
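A guarded usage sketch, following the wording quoted from the spec. The
names `fixed_int32_t`, `kSveBits` and `lanesOfInt32` are assumptions for this
example; on targets where the macro is not defined only the fallback branch
is compiled.

```cpp
#if defined(__ARM_FEATURE_SVE_BITS) && __ARM_FEATURE_SVE_BITS > 0
#include <arm_sve.h>
// Fixed-length alias of a sizeless SVE type; only valid when the code is
// built for a known vector length, e.g. via -msve-vector-bits.
typedef svint32_t fixed_int32_t
    __attribute__((arm_sve_vector_bits(__ARM_FEATURE_SVE_BITS)));
constexpr unsigned kSveBits = __ARM_FEATURE_SVE_BITS;
#else
// No fixed SVE vector length known at compile time.
constexpr unsigned kSveBits = 0;
#endif

// Number of 32-bit lanes in one fixed-length vector (0 without SVE bits).
constexpr unsigned lanesOfInt32() { return kSveBits / 32; }
```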
Reviewed By: david-arm
Differential Revision: https://reviews.llvm.org/D86720
We recently added support for -mtune. This patch adds /tune: so we can specify the tune CPU from clang-cl. MSVC doesn't support this but icc does.
Differential Revision: https://reviews.llvm.org/D86820
cache for implicit modules.
The ModuleManager's use of FileEntry nodes as the keys for its map of
loaded modules is less than ideal. Uniqueness for FileEntry nodes is
maintained by FileManager, which in turn uses inode numbers on hosts
that support that. When coupled with the module cache's proclivity for
turning over and deleting stale PCMs, this means entries for different
module files can wind up reusing the same underlying inode. When this
happens, subsequent accesses to the Modules map will disagree on the
ModuleFile associated with a given file.
In general, it is not sufficient to resolve this conundrum with a type
like FileEntryRef that stores the name of the FileEntry node on first
access because of path canonicalization issues. However, the paths
constructed for implicit module builds are fully under Clang's
control. We *can*, therefore, rely on their structure being consistent
across operating systems and across subsequent accesses to the Modules
map.
To mitigate the effects of inode reuse, perform an extra name check when
implicit modules are returned from the cache. This has the effect of
forcing reused FileEntry nodes to stomp over existing-but-stale entries
in the cache, which simulates a miss - exactly the desired behavior.
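A toy model of the mitigation, not the ModuleManager code: the cache is
keyed by inode number, and a hit for an implicit module is only accepted if
the recorded file name also matches. `CachedModule`, `ModuleCache` and the
`implicit` flag are hypothetical names for this sketch.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>

struct CachedModule {
  std::string FileName; // name the entry was created for
};

class ModuleCache {
public:
  void insert(std::uint64_t Inode, CachedModule M) {
    Entries[Inode] = std::move(M);
  }

  // Returns the cached module, or nullptr on a miss. For implicit modules
  // a hit whose recorded name differs from the expected one is treated as
  // stale: the entry is dropped and the lookup reports a miss.
  const CachedModule *lookup(std::uint64_t Inode,
                             const std::string &ExpectedName, bool Implicit) {
    auto It = Entries.find(Inode);
    if (It == Entries.end())
      return nullptr;
    if (Implicit && It->second.FileName != ExpectedName) {
      Entries.erase(It); // inode was reused by a different module file
      return nullptr;
    }
    return &It->second;
  }

private:
  std::unordered_map<std::uint64_t, CachedModule> Entries;
};
```

After a mismatch the stale entry is gone, so the caller reloads the module
and inserts a fresh entry under the reused inode.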
rdar://48443680
Patch by Robert Widmann!
Differential Revision: https://reviews.llvm.org/D86823
Temporarily revert commit 04abbb3a78
due to regressions in some HIP apps due to backend issues revealed by
this change.
Will re-commit it when backend issues are fixed.
This patch restores the default traversal for Transformer's `makeRule` to
`TK_AsIs`. The implicit mode has proven problematic.
Differential Revision: https://reviews.llvm.org/D87048
This patch implements the builtins for the vector multiply instructions (vmulxxd family) and adds the appropriate test cases for these builtins. The builtins utilize the vector multiply instructions introduced with ISA 3.1.
Differential Revision: https://reviews.llvm.org/D83955