Summary:
Specifically, we upgrade llvm.nvvm.:
* brev{32,64}
* clz.{i,ll}
* popc.{i,ll}
* abs.{i,ll}
* {min,max}.{i,ll,u,ull}
* h2f
These either map directly to an existing LLVM target-generic
intrinsic or map to a simple LLVM target-generic idiom.
In all cases, we check that the code we generate is lowered to PTX as we
expect.
These builtins don't need to be backfilled in clang: They're not
accessible to user code from nvcc.
Reviewers: tra
Subscribers: majnemer, cfe-commits, llvm-commits, jholewinski
Differential Revision: https://reviews.llvm.org/D28793
llvm-svn: 292694
For a << b (as original vec_sl does), if b >= sizeof(a) * 8, the
behavior is undefined. However, Power instructions do define the
behavior, which is equivalent to a << (b % (sizeof(a) * 8)).
This patch changes altivec.h to use a << (b % (sizeof(a) * 8)), to
ensure the consistent semantic of the instructions. Then it combines
the generated multiple instructions back to a single shift.
This patch handles left shift only. Right shift, on the other hand, is
more complicated, considering arithematic/logical right shift.
Differential Revision: https://reviews.llvm.org/D28037
llvm-svn: 292659
This commit improves the mismatched destructor type error by detecting when the
destructor call has used a '.' instead of a '->' on a pointer to the destructed
type. The diagnostic now suggests to use '->' instead of '.', and adds a fixit
where appropriate.
rdar://28766702
Differential Revision: https://reviews.llvm.org/D25817
llvm-svn: 292615
Summary:
rL292562 added a fix to always format if the fallback style is set to "none".
In test/Format/style-on-command-line.cpp:19 is redundant, since -fallback-style
has a default value of LLVM set in ClangFormat.cpp:72.
@amaiorano: I believe that the rest of the test cases still cover your change in
case the fallback style is explicitly set to "none". Please, if this is not the
case, initiate a discussion.
Reviewers: ioeric, bkramer
Reviewed By: ioeric
Subscribers: cfe-commits, klimek, amaiorano
Differential Revision: https://reviews.llvm.org/D28943
llvm-svn: 292604
Summary:
It seems that rL292518 introduced a RUN: false, but the continuation rL292545
forgot to remove it back.
This has flown under the radar, because it's a long test and doesn't get
executed by default during sanity testing.
To test:
$ cd llvm_build
$ ./bin/llvm-lit --param run_long_tests=true tools/clang/test/Driver/response-file.c
@rsmith: have a look if this change is OK please.
Reviewers: bkramer
Reviewed By: bkramer
Subscribers: cfe-commits, rsmith
Differential Revision: https://reviews.llvm.org/D28941
llvm-svn: 292600
Summary: Instead of picking the buffer file coding system, always use utf-8-unix for communicating with clang-format. This is fine because clang-format never actually reads the file to be formatted, only standard input. This is a bit simpler (process coding system is now a constant) and potentially faster, as utf-8-unix is Emacs's internal coding system. Also add an end-to-end test that actually invokes clang-format.
Reviewers: klimek
Reviewed By: klimek
Differential Revision: https://reviews.llvm.org/D28904
llvm-svn: 292593
with SEH and openmp
In some cituations (during codegen for Windows SEH constructs)
CodeGenFunction instance may have CurFn equal to nullptr. OpenMP related
code does not expect such situation during cleanup.
llvm-svn: 292590
This fixes clang-format not formatting if fallback-style is explicitly set to
"none", and either a config file is found or YAML is passed in without a
"BasedOnStyle". With this change, passing "none" in these cases will have no
affect, and LLVM style will be used as the base style.
Differential Revision: https://reviews.llvm.org/D28844
llvm-svn: 292562
by providing a memchr builtin that returns char* instead of void*.
Also add a __has_feature flag to indicate the presence of constexpr forms of
the relevant <string> functions.
llvm-svn: 292555
Under this defect resolution, the injected-class-name of a class or class
template cannot be used except in very limited circumstances (when declaring a
constructor, in a nested-name-specifier, in a base-specifier, or in an
elaborated-type-specifier). This is apparently done to make parsing easier, but
it's a pain for us since we don't know whether a template-id using the
injected-class-name is valid at the point when we annotate it (we don't yet
know whether the template-id will become part of an elaborated-type-specifier).
As a tentative resolution to a perceived language defect, mem-initializer-ids
are added to the list of exceptions here (they generally follow the same rules
as base-specifiers).
When the reference to the injected-class-name uses the 'typename' or 'template'
keywords, we permit it to be used to name a type or template as an extension;
other compilers also accept some cases in this area. There are also a couple of
corner cases with dependent template names that we do not yet diagnose, but
which will also get this treatment.
llvm-svn: 292518
Summary:
The warning doesn't know why the variable was looked up but not
odr-used, so reword it to not claim that it was used in an unevaluated
context.
Reviewers: aaron.ballman
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D28902
llvm-svn: 292498
Summary:
Docs for clang::Decl and clang::TemplateSpecializationType have
not been generated since LLVM_ALIGNAS was added to them.
Tell Doxygen to expand LLVM_ALIGNAS to nothing as described at
https://www.stack.nl/~dimitri/doxygen/manual/preprocessing.html
Reviewers: aaron.ballman, klimek, alexfh
Subscribers: ioeric, cfe-commits
Differential Revision: https://reviews.llvm.org/D28850
llvm-svn: 292477
Summary:
SamplePGO uses profile with debug info to collect profile. Unlike the traditional debugging purpose, sample pgo needs more accurate debug info to represent the profile. We add -femit-accurate-debug-info for this purpose. It can be combined with all debugging modes (-g, -gmlt, etc). It makes sure that the following pieces of info is always emitted:
* start line of all subprograms
* linkage name of all subprograms
* standalone subprograms (functions that has neither inlined nor been inlined)
The impact on speccpu2006 binary size (size increase comparing with -g0 binary, also includes data for -g binary, which does not change with this patch):
-gmlt(orig) -gmlt(patched) -g
433.milc 4.68% 5.40% 19.73%
444.namd 8.45% 8.93% 45.99%
447.dealII 97.43% 115.21% 374.89%
450.soplex 27.75% 31.88% 126.04%
453.povray 21.81% 26.16% 92.03%
470.lbm 0.60% 0.67% 1.96%
482.sphinx3 5.77% 6.47% 26.17%
400.perlbench 17.81% 19.43% 73.08%
401.bzip2 3.73% 3.92% 12.18%
403.gcc 31.75% 34.48% 122.75%
429.mcf 0.78% 0.88% 3.89%
445.gobmk 6.08% 7.92% 42.27%
456.hmmer 10.36% 11.25% 35.23%
458.sjeng 5.08% 5.42% 14.36%
462.libquantum 1.71% 1.96% 6.36%
464.h264ref 15.61% 16.56% 43.92%
471.omnetpp 11.93% 15.84% 60.09%
473.astar 3.11% 3.69% 14.18%
483.xalancbmk 56.29% 81.63% 353.22%
geomean 15.60% 18.30% 57.81%
Debug info size change for -gmlt binary with this patch:
433.milc 13.46%
444.namd 5.35%
447.dealII 18.21%
450.soplex 14.68%
453.povray 19.65%
470.lbm 6.03%
482.sphinx3 11.21%
400.perlbench 8.91%
401.bzip2 4.41%
403.gcc 8.56%
429.mcf 8.24%
445.gobmk 29.47%
456.hmmer 8.19%
458.sjeng 6.05%
462.libquantum 11.23%
464.h264ref 5.93%
471.omnetpp 31.89%
473.astar 16.20%
483.xalancbmk 44.62%
geomean 16.83%
Reviewers: davidxl, andreadb, rob.lougher, dblaikie, echristo
Reviewed By: dblaikie, echristo
Subscribers: hfinkel, rob.lougher, andreadb, gbedwell, cfe-commits, probinson, llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D25435
llvm-svn: 292458
In ThinLTO mode, type metadata will require the module to be written as a
multi-module bitcode file, which is currently incompatible with the Darwin
linker. It is also useful to be able to enable or disable multi-module bitcode
for testing purposes. This introduces a cc1-level flag, -f{,no-}lto-unit,
which is used by the driver to enable multi-module bitcode on all but
Darwin+ThinLTO, and can also be used to enable/disable the feature manually.
Differential Revision: https://reviews.llvm.org/D28877
llvm-svn: 292448
Introduced in r181561 - it may've been subsumed by work done to allow
emission of declarations for vtable types while still emitting some of
their member functions correctly for those declarations. Whatever the
reason, the tests pass without this code now.
llvm-svn: 292439
The if-clause on the combined directive potentially applies to both the
'target' and the 'parallel' regions. Codegen'ing the if-clause on the
combined directive requires additional support because the expression in
the clause must be captured by the 'target' capture statement but not
the 'parallel' capture statement. Note that this situation arises for
other clauses such as num_threads.
The OMPIfClause class inherits OMPClauseWithPreInit to support capturing
of expressions in the clause. A member CaptureRegion is added to
OMPClauseWithPreInit to indicate which captured statement (in this case
'target' but not 'parallel') captures these expressions.
To ensure correct codegen of captured expressions in the presence of
combined 'target' directives, OMPParallelScope was added to 'parallel'
codegen.
Reviewers: ABataev
Differential Revision: https://reviews.llvm.org/D28781
llvm-svn: 292437
Summary:
Add a callback from ASTReader to DeserializationListener when the former
reads an IMPORTED_MODULES block. This supports Swift in using PCH for
bridging headers.
Reviewers: doug.gregor, manmanren, bruno
Reviewed By: manmanren
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D28779
llvm-svn: 292436
Summary:
Code committed in rL290219 went through a few iterations; test wound up with
stale comment.
Reviewers: doug.gregor, manmanren
Reviewed By: manmanren
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D28790
llvm-svn: 292435
This patch adds codegen for the 'target parallel' directive on the NVPTX
device. We term offload OpenMP directives such as 'target parallel' and
'target teams distribute parallel for' as SPMD constructs. SPMD constructs,
in contrast to Generic ones like the plain 'target', can never contain
a serial region.
SPMD constructs can be handled more efficiently on the GPU and do not
require the Warp Loop of the Generic codegen scheme. This patch adds
SPMD codegen support for 'target parallel' on the NVPTX device and can
be reused for other SPMD constructs.
Reviewers: ABataev
Differential Revision: https://reviews.llvm.org/D28755
llvm-svn: 292428
This rule permits the injected-class-name of a class template to be used as
both a template type argument and a template template argument, with no extra
syntax required to disambiguate.
llvm-svn: 292426
This patch adds support for codegen of 'target parallel' on the host.
It is also the first combined directive that requires two or more
captured statements. Support for this functionality is included in
the patch.
A combined directive such as 'target parallel' has two captured
statements, one for the 'target' and the other for the 'parallel'
region. Two captured statements are required because each has
different implicit parameters (see SemaOpenMP.cpp). For example,
the 'parallel' has 'global_tid' and 'bound_tid' while the 'target'
does not. The patch adds support for handling multiple captured
statements based on the combined directive.
When codegen'ing the 'target parallel' directive, the 'target'
outlined function is created using the outer captured statement
and the 'parallel' outlined function is created using the inner
captured statement.
Reviewers: ABataev
Differential Revision: https://reviews.llvm.org/D28753
llvm-svn: 292419
This is just wasted space, we don't support state points from multiple
source managers. Validate that there's no state when resetting the
source manager and use the 'global' reference to the sourcemanager
instead of the ones in the diag state.
llvm-svn: 292402
The idea for this originated from a really tricky bug: ISRs on ARM don't
automatically save off the VFP regs, so if say, memcpy gets interrupted and the
ISR itself calls memcpy, the regs are left clobbered when the ISR is done.
https://reviews.llvm.org/D28820
llvm-svn: 292375
This patch adds support for codegen of 'target parallel' on the host.
It is also the first combined directive that requires two or more
captured statements. Support for this functionality is included in
the patch.
A combined directive such as 'target parallel' has two captured
statements, one for the 'target' and the other for the 'parallel'
region. Two captured statements are required because each has
different implicit parameters (see SemaOpenMP.cpp). For example,
the 'parallel' has 'global_tid' and 'bound_tid' while the 'target'
does not. The patch adds support for handling multiple captured
statements based on the combined directive.
When codegen'ing the 'target parallel' directive, the 'target'
outlined function is created using the outer captured statement
and the 'parallel' outlined function is created using the inner
captured statement.
Reviewers: ABataev
Differential Revision: https://reviews.llvm.org/D28753
llvm-svn: 292374
These two are independent: it's possible to use LLD without LTO,
and it's possible to do LTO build without LLD.
Differential Revision: https://reviews.llvm.org/D28821
llvm-svn: 292343