Summary:
This fixes the number of SGPRs and VGPRs in the *_RSRC1 register to
allow for registers set up in wave dispatch, even if those registers are
not used in the shader.
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D45503
Change-Id: I6575f0e0d2a528d1319d0b289f0ebe4510fa5771
llvm-svn: 329808
C++ [over.built] p4:
"For every pair (T, VQ), where T is an arithmetic type other than bool, and VQ is either volatile or empty, there exist candidate operator functions of the form
VQ T& operator--(VQ T&);
T operator--(VQ T&, int);
"
The bool type is in position LastPromotedIntegralType in BuiltinOperatorOverloadBuilder::getArithmeticType::ArithmeticTypes, but addPlusPlusMinusMinusArithmeticOverloads() was expecting it at position 0.
Differential Revision: https://reviews.llvm.org/D44988
rdar://problem/34255516
llvm-svn: 329804
There are plenty of ways attaching can go wrong. Having the server
report the exact error means we can give better feedback to the user.
(This patch does not do the second part, it only makes sure the
information is sent from the server.)
Triggering all possible error conditions in a test would prove
challenging, but there is one error that is very easy to reproduce
(attempting to attach while debugging), so I write a test based on that.
The test immediately exposed a bug where the m_send_error_strings field
was being used uninitialized (so it was sometimes true from the get-go),
so I fix that as well.
llvm-svn: 329803
In r329691, we would choose FP even if the offset wouldn't fit, just
because the offset is smaller than the one from BP. This made many
accesses through FP need to scavenge a register, which resulted in
slower and bigger code for no good reason.
This patch now always picks the offset that fits first, even if FP is
preferred.
llvm-svn: 329797
This is a follow up of rL327695 to instruction select more variants of VSELGT
and VSELGE, for which it is necessary to custom lower SELECT.
More work is required in this area, which will be addressed soon:
- more variants need to be regression tested, but this depends on the next point.
- first LowerConstantFP need to be adjusted for fp16 values.
Differential Revision: https://reviews.llvm.org/D45205
llvm-svn: 329788
Summary:
"-std c++11" is not valid in compiler, we have to use "-std=c++11".
Test in vscode with this patch, code completion for header works as expected.
Reviewers: sammccall
Subscribers: cfe-commits, klimek
Differential Revision: https://reviews.llvm.org/D45512
llvm-svn: 329786
Avoid storing duplicated "std::string"s.
clangd's global-symbol-builder takes 20+GB memory running across LLVM
repository. With this patch, the used memory is ~10GB (running on 48
threads, most of meory are AST-related).
Subscribers: klimek, cfe-commits
Differential Revision: https://reviews.llvm.org/D45479
llvm-svn: 329784
Summary:
Merged 'tryMatchVectorRegister' (specific to Neon) and
'tryParseSVERegister' into a single 'tryParseVectorRegister' function, and
created a generic 'parseVectorKind()' function that returns the #Elements
and ElementWidth of a vector suffix. This reduces the duplication of
this functionality between two the vector implementations.
This is patch [1/6] in a series to add assembler/disassembler support for
SVE's contiguous ST1 (scalar+imm) instructions.
Reviewers: fhahn, rengolin, javed.absar, huntergr, SjoerdMeijer, t.p.northover, echristo, evandro
Reviewed By: fhahn
Subscribers: tschuett, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D45427
llvm-svn: 329782
Since the range-based constraint manager (default) is weak in handling comparisons where symbols are on both sides it is wise to rearrange them to have symbols only on the left side. Thus e.g. A + n >= B + m becomes A - B >= m - n which enables the constraint manager to store a range m - n .. MAX_VALUE for the symbolic expression A - B. This can be used later to check whether e.g. A + k == B + l can be true, which is also rearranged to A - B == l - k so the constraint manager can check whether l - k is in the range (thus greater than or equal to m - n).
The restriction in this version is the the rearrangement happens only if both the symbols and the concrete integers are within the range [min/4 .. max/4] where min and max are the minimal and maximal values of their type.
The rearrangement is not enabled by default. It has to be enabled by using -analyzer-config aggressive-relational-comparison-simplification=true.
Co-author of this patch is Artem Dergachev (NoQ).
Differential Revision: https://reviews.llvm.org/D41938
llvm-svn: 329780
This was removed in D39932 but turned out this is actually needed
because runtimes such as compiler-rt and libc++ rely on common options
processing for setting certain flags such as -ffunction-sections and
-fdata-sections.
Differential Revision: https://reviews.llvm.org/D45507
llvm-svn: 329778
The 128/256-bit versions were no longer used by clang. It uses the legacy SSE/AVX2 version and a select. The 512-bit was changed to the same for consistency.
llvm-svn: 329774
Summary:
This patch implements the `-fxray-modes=` flag which allows users
building with XRay instrumentation to decide which modes to pre-package
into the binary being linked. The default is the status quo, which will
link all the available modes.
For this to work we're also breaking apart the mode implementations
(xray-fdr and xray-basic) from the main xray runtime. This gives more
granular control of which modes are pre-packaged, and picked from
clang's invocation.
This fixes llvm.org/PR37066.
Note that in the future, we may change the default for clang to only
contain the profiling implementation under development in D44620, when
that implementation is ready.
Reviewers: echristo, eizan, chandlerc
Reviewed By: echristo
Subscribers: mgorny, mgrang, cfe-commits, llvm-commits
Differential Revision: https://reviews.llvm.org/D45474
llvm-svn: 329772
This avoids the need for a custom generated config file which is desired
because the custom config files differs per-target which means we cannot
reuse headers across different targets.
Differential Revision: https://reviews.llvm.org/D45304
llvm-svn: 329770
With -fno-plt, for example, calls to printf when getting converted to puts
still use the PLT. This patch checks for the metadata "RtLibUseGOT" and
annotates the declaration with the right attributes.
Differential Revision: https://reviews.llvm.org/D45180
llvm-svn: 329768
Author: Samuel Pitoiset
ds_read_b128 and ds_write_b128 have been recently enabled
under the amdgpu-ds128 option because the performance benefit
is unclear.
Though, using 128-bit loads/stores for the local address space
appears to introduce regressions in tessellation shaders. Not
sure what is broken, but as ds_read_b128/ds_write_b128 are not
enabled by default, just introduce a global option and enable
128-bit only if requested (until it's fixed/used correctly).
v2: - fix regressions in merge-stores.ll and multiple_tails.ll
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105464
llvm-svn: 329764
Summary:
When inserting MOVs to avoid Falkor HWPF collisions, the non-base
register operand of load instructions (e.g. a register offset) was not
being considered live, so it could potentially have been used as a
scratch register, clobbering the actual offset value.
Reviewers: mcrosier
Subscribers: rengolin, javed.absar, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D45502
llvm-svn: 329761
This check attempts to catch buggy uses of the `TEMP_FAILURE_RETRY`
macro, which is provided by both Bionic and glibc.
Differential Revision: https://reviews.llvm.org/D45059
llvm-svn: 329759
This is based on an example that was recently posted on llvm-dev:
void *propagate_null(void* b, int* g) {
if (!b) {
return 0;
}
(*g)++;
return b;
}
https://godbolt.org/g/xYk3qG
The original code or constant propagation in other passes has obscured the fact
that the phi can be removed completely.
Differential Revision: https://reviews.llvm.org/D45448
llvm-svn: 329755