CallGraph previously would just show the normal name of a function,
which gets really confusing when using it on large C++ projects. This
patch switches the printName call to a printQualifiedName, so that the
namespaces are included.
Change-Id: Ie086d863f6b2251be92109ea1b0946825b28b49a
llvm-svn: 348950
When multiple loop transformation are defined in a loop's metadata, their order of execution is defined by the order of their respective passes in the pass pipeline. For instance, e.g.
#pragma clang loop unroll_and_jam(enable)
#pragma clang loop distribute(enable)
is the same as
#pragma clang loop distribute(enable)
#pragma clang loop unroll_and_jam(enable)
and will try to loop-distribute before Unroll-And-Jam because the LoopDistribute pass is scheduled after UnrollAndJam pass. UnrollAndJamPass only supports one inner loop, i.e. it will necessarily fail after loop distribution. It is not possible to specify another execution order. Also,t the order of passes in the pipeline is subject to change between versions of LLVM, optimization options and which pass manager is used.
This patch adds 'followup' attributes to various loop transformation passes. These attributes define which attributes the resulting loop of a transformation should have. For instance,
!0 = !{!0, !1, !2}
!1 = !{!"llvm.loop.unroll_and_jam.enable"}
!2 = !{!"llvm.loop.unroll_and_jam.followup_inner", !3}
!3 = !{!"llvm.loop.distribute.enable"}
defines a loop ID (!0) to be unrolled-and-jammed (!1) and then the attribute !3 to be added to the jammed inner loop, which contains the instruction to distribute the inner loop.
Currently, in both pass managers, pass execution is in a fixed order and UnrollAndJamPass will not execute again after LoopDistribute. We hope to fix this in the future by allowing pass managers to run passes until a fixpoint is reached, use Polly to perform these transformations, or add a loop transformation pass which takes the order issue into account.
For mandatory/forced transformations (e.g. by having been declared by #pragma omp simd), the user must be notified when a transformation could not be performed. It is not possible that the responsible pass emits such a warning because the transformation might be 'hidden' in a followup attribute when it is executed, or it is not present in the pipeline at all. For this reason, this patche introduces a WarnMissedTransformations pass, to warn about orphaned transformations.
Since this changes the user-visible diagnostic message when a transformation is applied, two test cases in the clang repository need to be updated.
To ensure that no other transformation is executed before the intended one, the attribute `llvm.loop.disable_nonforced` can be added which should disable transformation heuristics before the intended transformation is applied. E.g. it would be surprising if a loop is distributed before a #pragma unroll_and_jam is applied.
With more supported code transformations (loop fusion, interchange, stripmining, offloading, etc.), transformations can be used as building blocks for more complex transformations (e.g. stripmining+stripmining+interchange -> tiling).
Reviewed By: hfinkel, dmgreen
Differential Revision: https://reviews.llvm.org/D49281
Differential Revision: https://reviews.llvm.org/D55288
llvm-svn: 348944
Clang's CallGraph analysis doesn't use the RecursiveASTVisitor's setting
togo into template instantiations. The result is that anything wanting
to do call graph analysis ends up missing any template function calls.
Change-Id: Ib4af44ed59f15d43f37af91622a203146a3c3189
llvm-svn: 348942
The Darwin targets use `int64_t` and `uint64_t` to define the `int_least64_t`
and `int_fast64_t` types. The underlying type is actually a `long long`. Match
the types to allow the printf specifiers to work properly and have the compiler
vended macros match the implementation on the target.
llvm-svn: 348939
Summary:
`memchr` and `memcmp` operate upon the character units of the object
representation; that is, the `size_t` parameter expresses the number of
character units. The constant folding implementation is updated in this
patch to account for multibyte element types in the arrays passed to
`memchr`/`memcmp` and, in the case of `memcmp`, to account for the
possibility that the arrays may have differing element types (even when
they are byte-sized).
Actual inspection of the object representation is not implemented.
Comparisons are done only between elements with the same object size;
that is, `memchr` will fail when inspecting at least one character unit
of a multibyte element. The integer types are assumed to have two's
complement representation with 0 for `false`, 1 for `true`, and no
padding bits.
`memcmp` on multibyte elements will only be able to fold in cases where
enough elements are equal for the answer to be 0.
Various tests are added to guard against incorrect folding for cases
that miscompile on some system or other prior to this patch. At the same
time, the unsigned 32-bit `wchar_t` testing in
`test/SemaCXX/constexpr-string.cpp` is restored.
Reviewers: rsmith, aaron.ballman, hfinkel
Reviewed By: rsmith
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D55510
llvm-svn: 348938
Summary:
Added support for the -gline-directives-only option + fixed logic of the
debug info for CUDA devices. If optimization level is O0, then options
--[no-]cuda-noopt-device-debug do not affect the debug info level. If
the optimization level is >O0, debug info options are used +
--no-cuda-noopt-device-debug is used or no --cuda-noopt-device-debug is
used, the optimization level for the device code is kept and the
emission of the debug directives is used.
If the opt level is > O0, debug info is requested +
--cuda-noopt-device-debug option is used, the optimization is disabled
for the device code + required debug info is emitted.
Reviewers: tra, echristo
Subscribers: aprantl, guansong, JDevlieghere, cfe-commits
Differential Revision: https://reviews.llvm.org/D51554
llvm-svn: 348930
This library was breaking my -DBUILD_SHARED_LIBS=1 build. rC348915 seemed to miss this case.
As this seems an "obvious" fix, I am committing without pre-commit review as
per the LLVM developer policy.
llvm-svn: 348929
Address spaces are cast into generic before invoking the constructor.
Added support for a trailing Qualifiers object in FunctionProtoType.
Differential Revision: https://reviews.llvm.org/D54862
llvm-svn: 348927
This is a more thorough fix of rC348911.
The story about -DBUILD_SHARED_LIBS=on build after rC348907 (Move PCHContainerOperations from Frontend to Serialization) is:
1. libclangSerialization.so defines PCHContainerReader dtor, ...
2. clangFrontend and clangTooling define classes inheriting from PCHContainerReader, thus their DSOs have undefined references on PCHContainerReader dtor
3. Components depending on either clangFrontend or clangTooling cannot be linked unless they have explicit dependency on clangSerialization due to the default linker option -z defs. The explicit dependency could be avoided if libclang{Frontend,Tooling}.so had these undefined references.
This patch adds the explicit dependency on clangSerialization to make them build.
llvm-svn: 348915
As reported in PR39946, these two implementations cause stack overflows
to occur when a type recursively contains itself. While this only
happens when an incomplete version of itself is used by membership (and
thus an otherwise invalid program), the crashes might be surprising.
The solution here is to replace the recursive implementation with one
that uses a std::vector as a queue. Old values are kept around to
prevent re-checking already checked types.
Change-Id: I582bb27147104763d7daefcfee39d91f408b9fa8
llvm-svn: 348899
The AST matcher documentation dumping script was being a bit over-zealous about stripping comment markers, which ended up causing comments in example code to stop being comments. Fix that by only stripping comments at the start of a line, rather than removing any forward slash (which also impacts prose text).
llvm-svn: 348891
Only explicitly look through integer and floating-point promotion where the result type is actually a promotion, which is not always the case for bit-fields in C.
llvm-svn: 348889
- explicit_bzero has limited scope/usage only for security/crypto purposes but is non-optimisable version of memset/0 and bzero.
- explicit_memset has similar signature and semantics as memset but is also a non-optimisable version.
Reviewers: NoQ
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D54592
llvm-svn: 348884
for the DICompileUnit.
This addresses post-commit feedback for D55085. Without this patch, a
main source file with an absolute paths may appear in different
DIFiles, once with the absolute path and once with the common prefix
between the absolute path and the current working directory.
Differential Revision: https://reviews.llvm.org/D55519
llvm-svn: 348865
Summary:
ASan does not support statically linked binaries, but ASan runtime itself can
be statically linked into a target binary executable.
Reviewers: eugenis, kcc
Reviewed By: eugenis
Subscribers: cfe-commits, llvm-commits
Differential Revision: https://reviews.llvm.org/D55066
llvm-svn: 348863
Memoization dose not seem to be necessary, as other statement visitors
run just fine without it,
and in fact seems to be causing memory corruptions.
Just removing it instead of investigating the root cause.
rdar://45945002
Differential Revision: https://reviews.llvm.org/D54921
llvm-svn: 348822
This is currently a diagnostics, but might be upgraded to an error in the future,
especially if we introduce os_return_on_success attributes.
rdar://46359592
Differential Revision: https://reviews.llvm.org/D55530
llvm-svn: 348820
Summary: Don't add a child just for the label.
Reviewers: aaron.ballman
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D55495
llvm-svn: 348794
Implement support for try-catch blocks in constexpr functions, as
proposed in http://wg21.link/P1002 and voted in San Diego for c++20.
The idea is that we can still never throw inside constexpr, so the catch
block is never entered. A try-catch block like this:
try { f(); } catch (...) { }
is then morally equivalent to just
{ f(); }
Same idea should apply for function/constructor try blocks.
rdar://problem/45530773
Differential Revision: https://reviews.llvm.org/D55097
llvm-svn: 348789
Summary:
SSE2 vectorization was added in 2012, but it is 2018 now and I can't
observe any performance boost (testing clang -E [all Sema/* CodeGen/* with proper -I options]) with the existing _mm_movemask_epi8+countTrailingZeros or the following SSE4.2 (compiling with -msse4.2):
__m128i C = _mm_setr_epi8('\r','\n',0,0,0,0,0,0,0,0,0,0,0,0,0,0);
_mm_cmpestri(C, 2, Chunk, 16, _SIDD_UBYTE_OPS | _SIDD_CMP_EQUAL_ANY | _SIDD_POSITIVE_POLARITY | _SIDD_LEAST_SIGNIFICANT)
Delete the vectorization to simplify the code.
Also simplify the code a bit and don't check the line ending sequence \n\r
Reviewers: bkramer, #clang
Reviewed By: bkramer
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D55484
llvm-svn: 348777
Use zip_longest in two locations that compare iterator ranges.
zip_longest allows the iteration using a range-based for-loop and to be
symmetric over both ranges instead of prioritizing one over the other.
In that latter case code have to handle the case that the first is
longer than the second, the second is longer than the first, and both
are of the same length, which must partially be checked after the loop.
With zip_longest, this becomes an element comparison within the loop
like the comparison of the elements themselves. The symmetry makes it
clearer that neither the first and second iterators are handled
differently. The iterators are not event used directly anymore, just
the ranges.
Differential Revision: https://reviews.llvm.org/D55468
llvm-svn: 348762