Summary:
It seems it's main effect is to create addition copies when values are inr register that do not support this trick, which increase register pressure and makes the code bigger.
The main noteworthy regression I was able to observe was pattern of the type (setcc (trunc (and X, C)), 0) where C is such as it would benefit from the hi register trick. To prevent this, a new pattern is added to materialize such pattern using a 32 bits test. This has the added benefit of working with any constant that is materializable as a 32bits immediate, not just the ones that can leverage the high register trick, as demonstrated by the test case in test-shrink.ll using the constant 2049 .
Reviewers: craig.topper, niravd, spatel, hfinkel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42646
llvm-svn: 323690
Prior to committing r323681, we decided to change pick() to identity() since
it wasn't clear from the name what pick() did. However, identity() isn't a very
good name either since it implies that no changes are made. For some reason,
naming it changeTo() didn't occur to me until just after the commit. This
should resolve the lack of clarity that pick() had while still implying that
it changes the MIR.
llvm-svn: 323689
Rafael pointed out that `hasInternalLinkage() || hasPrivateLinkage()` is
equivalent to `hasLocalLinkage()` in post-commit review.
I'm intentionally not updating the comment, partly because I like it
being explicit, and partly because "global symbols with local linkage"
sounds like an oxymoron.
llvm-svn: 323688
Summary:
The existing unit tests in FormatTestObjC.cpp didn't fully cover
all the cases for protocol confirmance list formatting.
This extends the unit tests to more cases of protocol
conformance list formatting, especially how the behavior changes
when `BinPackParameters` changes from `true` (the default) to `false`.
Test Plan: make -j12 FormatTests && \
./tools/clang/unittests/Format/FormatTests --gtest_filter=FormatTestObjC.\*
Reviewers: krasimir, jolesiak, stephanemoore
Reviewed By: krasimir
Subscribers: benhamilton, klimek, cfe-commits, hokein, Wizard
Differential Revision: https://reviews.llvm.org/D42649
llvm-svn: 323684
This reverts commit r322917 due to multiple performance regressions in spec2006
and spec2017. XFAILed llvm/test/CodeGen/AArch64/big-callframe.ll which initially
motivated this change.
llvm-svn: 323683
Summary:
As discussed in D42244, we have difficulty describing the legality of some
operations. We're not able to specify relationships between types.
For example, declaring the following
setAction({..., 0, s32}, Legal)
setAction({..., 0, s64}, Legal)
setAction({..., 1, s32}, Legal)
setAction({..., 1, s64}, Legal)
currently declares these type combinations as legal:
{s32, s32}
{s64, s32}
{s32, s64}
{s64, s64}
but we currently have no means to say that, for example, {s64, s32} is
not legal. Some operations such as G_INSERT/G_EXTRACT/G_MERGE_VALUES/
G_UNMERGE_VALUES have relationships between the types that are currently
described incorrectly.
Additionally, G_LOAD/G_STORE currently have no means to legalize non-atomics
differently to atomics. The necessary information is in the MMO but we have no
way to use this in the legalizer. Similarly, there is currently no way for the
register type and the memory type to differ so there is no way to cleanly
represent extending-load/truncating-store in a way that can't be broken by
optimizers (resulting in illegal MIR).
It's also difficult to control the legalization strategy. We've added support
for legalizing non-power of 2 types but there's still some hardcoded assumptions
about the strategy. The main one I've noticed is that type0 is always legalized
before type1 which is not a good strategy for `type0 = G_EXTRACT type1, ...` if
you need to widen the container. It will converge on the same result eventually
but it will take a much longer route when legalizing type0 than if you legalize
type1 first.
Lastly, the definition of legality and the legalization strategy is kept
separate which is not ideal. It's helpful to be able to look at a one piece of
code and see both what is legal and the method the legalizer will use to make
illegal MIR more legal.
This patch adds a layer onto the LegalizerInfo (to be removed when all targets
have been migrated) which resolves all these issues.
Here are the rules for shift and division:
for (unsigned BinOp : {G_LSHR, G_ASHR, G_SDIV, G_UDIV})
getActionDefinitions(BinOp)
.legalFor({s32, s64}) // If type0 is s32/s64 then it's Legal
.clampScalar(0, s32, s64) // If type0 is <s32 then WidenScalar to s32
// If type0 is >s64 then NarrowScalar to s64
.widenScalarToPow2(0) // Round type0 scalars up to powers of 2
.unsupported(); // Otherwise, it's unsupported
This describes everything needed to both define legality and describe how to
make illegal things legal.
Here's an example of a complex rule:
getActionDefinitions(G_INSERT)
.unsupportedIf([=](const LegalityQuery &Query) {
// If type0 is smaller than type1 then it's unsupported
return Query.Types[0].getSizeInBits() <= Query.Types[1].getSizeInBits();
})
.legalIf([=](const LegalityQuery &Query) {
// If type0 is s32/s64/p0 and type1 is a power of 2 other than 2 or 4 then it's legal
// We don't need to worry about large type1's because unsupportedIf caught that.
const LLT &Ty0 = Query.Types[0];
const LLT &Ty1 = Query.Types[1];
if (Ty0 != s32 && Ty0 != s64 && Ty0 != p0)
return false;
return isPowerOf2_32(Ty1.getSizeInBits()) &&
(Ty1.getSizeInBits() == 1 || Ty1.getSizeInBits() >= 8);
})
.clampScalar(0, s32, s64)
.widenScalarToPow2(0)
.maxScalarIf(typeInSet(0, {s32}), 1, s16) // If type0 is s32 and type1 is bigger than s16 then NarrowScalar type1 to s16
.maxScalarIf(typeInSet(0, {s64}), 1, s32) // If type0 is s64 and type1 is bigger than s32 then NarrowScalar type1 to s32
.widenScalarToPow2(1) // Round type1 scalars up to powers of 2
.unsupported();
This uses a lambda to say that G_INSERT is unsupported when type0 is bigger than
type1 (in practice, this would be a default rule for G_INSERT). It also uses one
to describe the legal cases. This particular predicate is equivalent to:
.legalFor({{s32, s1}, {s32, s8}, {s32, s16}, {s64, s1}, {s64, s8}, {s64, s16}, {s64, s32}})
In terms of performance, I saw a slight (~6%) performance improvement when
AArch64 was around 30% ported but it's pretty much break even right now.
I'm going to take a look at constexpr as a means to reduce the initialization
cost.
Future work:
* Make it possible for opcodes to share rulesets. There's no need for
G_LSHR/G_ASHR/G_SDIV/G_UDIV to have separate rule and ruleset objects. There's
no technical barrier to this, it just hasn't been done yet.
* Replace the type-index numbers with an enum to get .clampScalar(Type0, s32, s64)
* Better names for things like .maxScalarIf() (clampMaxScalar?) and the vector rules.
* Improve initialization cost using constexpr
Possible future work:
* It's possible to make these rulesets change the MIR directly instead of
returning a description of how to change the MIR. This should remove a little
overhead caused by parsing the description and routing to the right code, but
the real motivation is that it removes the need for LegalizeAction::Custom.
With Custom removed, there's no longer a requirement that Custom legalization
change the opcode to something that's considered legal.
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar, volkan, reames, bogner
Reviewed By: bogner
Subscribers: hintonda, bogner, aemerson, mgorny, javed.absar, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D42251
llvm-svn: 323681
We now test that pic and static produce different results for bar.
The function names were demangled.
The attributes are written inline.
llvm-svn: 323680
Summary:
This disables some of the most commonly used text proto delimiters and functions
for google style until we resolve several style options for that style.
In particular, wheter there should be a space surrounding braces ``msg { sub { key : value } }``
and the extent of packing of submessages on a same line.
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: klimek, cfe-commits
Differential Revision: https://reviews.llvm.org/D42651
llvm-svn: 323678
Summary:
Fix a few places that were modifying code after register
allocation to set the renamable bit correctly to avoid failing the
validation added in D42449.
llvm-svn: 323675
Fix the Linux plugin lookup path to include appropriate libdir suffix
for the system. To accomplish this, store the value of
LLVM_LIBDIR_SUFFIX in lldb/Host/Config.h as LLDB_LIBDIR_SUFFIX,
and use this variable when defining the plugin path.
Differential Revision: https://reviews.llvm.org/D42317
llvm-svn: 323673
Summary:
The improvements to the LegalizerInfo discussed in D42244 require that
LegalizerInfo::LegalizeAction be available for use in other classes. As such,
it needs to be moved out of LegalizerInfo. This has been done separately to the
next patch to minimize the noise in that patch.
llvm-svn: 323669
Summary:
`clang-format -dump-config path/to/file.h` never passed
anything for the Code parameter to clang::format::getStyle().
This meant the logic to guess Objective-C from the contents
of a .h file never worked, because LibFormat didn't have the
code to work with.
With this fix, we now correctly read in the contents of the
file if possible with -dump-config.
I had to update the lit config for test/Format/ because
the default config ignores .h files.
Test Plan: make -j12 check-clang
Reviewers: jolesiak, krasimir
Reviewed By: jolesiak, krasimir
Subscribers: Wizard, klimek, cfe-commits, djasper
Differential Revision: https://reviews.llvm.org/D42395
llvm-svn: 323668
Microsoft Visual Studio rejects the static constexpr static list of
atoms even though it's valid C++. This provides a workaround to unbreak
the bots.
llvm-svn: 323667
Autoconf and some other systems tend to add essential compilation
options to CC (e.g. -std=gnu99). When running such an auto-generated
makefile, scan-build does not need to change CC and CXX as they are
already set to use ccc-analyzer by a configure script.
Implement a new option --keep-cc as was proposed in this discussion:
http://lists.llvm.org/pipermail/cfe-dev/2013-September/031832.html
Patch by Paul Fertser!
llvm-svn: 323665
clang's -x option doesn't accept c-cpp-output as a language (even though
463eb6ab was merged, the driver still doesn't handle that).
This bug prevents testing C language projects when ccache is used.
Fixes#25851.
Investigation and patch by Dave Rigby.
llvm-svn: 323664
Summary:
If the same value is going to be vectorized several times in the same
tree entry, this entry is considered to be a gather entry and cost of
this gather is counter as cost of InsertElementInstrs for each gathered
value. But we can consider these elements as ShuffleInstr with
SK_PermuteSingle shuffle kind.
Reviewers: spatel, RKSimon, mkuper, hfinkel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38697
llvm-svn: 323662
Summary:
o Replace the existing clangd::URI with a wrapper of FileURI which also
carries a resolved file path.
o s/FileURI/URI/
o Get rid of the URI hack in vscode extension.
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: klimek, ilya-biryukov, jkorous-apple, cfe-commits
Differential Revision: https://reviews.llvm.org/D42419
llvm-svn: 323660
MSVC complains that the constexpr "expression did not evaluate to a
constant". Trying to make it happy by adding a `const` specifier as
suggested in https://stackoverflow.com/questions/37574343.
llvm-svn: 323659
There was a failure on a bot:
http://lab.llvm.org:8011/builders/clang-cmake-mipsel/builds/1283
strerror test is indeed flaky. We protect all races by a barrier in other tests
to eliminate flakiness. Do the same here.
No idea why tls_race2.cc failed. Add output at the end of the test
as we do in other tests. Sometimes test process crashes somewhere
in the middle (e.g. during race reporting) and it looks like empty output.
Output at the end of test allows to understand if the process has crashed,
or it has finished but produced no race reports.
Reviewed in https://reviews.llvm.org/D42633
llvm-svn: 323657
This patch adds support for generating accelerator tables in dsymutil.
This feature was already present in our internal repository but not yet
upstreamed because it requires changes to the Apple accelerator table
implementation.
Differential revision: https://reviews.llvm.org/D42501
llvm-svn: 323655
This patch renames DwarfAccelTable.{h,cpp} to AccelTable.{h,cpp} and
moves the header to the include dir so it is accessible by the
dsymutil implementation.
Differential revision: https://reviews.llvm.org/D42529
llvm-svn: 323654
This patch refactors the way data is stored in the accelerator table and
makes them truly generic. There have been several attempts to do this in
the past:
- D8215 & D8216: Using a union and partial hardcoding.
- D11805: Using inheritance.
- D42246: Using a callback.
In the end I didn't like either of them, because for some reason or
another parts of it felt hacky or decreased runtime performance. I
didn't want to completely rewrite them as I was hoping that we could
reuse parts for the successor in the DWARF standard. However, it seems
less and less likely that there will be a lot of opportunities for
sharing code and/or an interface.
Originally I choose to template the whole class, because it introduces
no performance overhead compared to the original implementation.
We ended up settling on a hybrid between a templated method and a
virtual call to emit the data. The motivation is that we don't want to
increase code size for a feature that should soon be superseded by the
DWARFv5 accelerator tables. While the code will continue to be used for
compatibility, it won't be on the hot path. Furthermore this does not
regress performance compared to Apple's internal implementation that
already uses virtual calls for this.
A quick summary for why these changes are necessary: dsymutil likes to
reuse the current implementation of the Apple accelerator tables.
However, LLDB expects a slightly different interface than what is
currently emitted. Additionally, in dsymutil we only have offsets and no
actual DIEs.
Although the patch suggests a lot of code has changed, this change is
pretty straightforward:
- We created an abstract class `AppleAccelTableData` to serve as an
interface for the different data classes.
- We created two implementations of this class, one for type tables and
one for everything else. There will be a third one for dsymutil that
takes just the offset.
- We use the supplied class to deduct the atoms for the header which
makes the structure of the table fully self contained, although not
enforced by the interface as was the case for the fully templated
approach.
- We renamed the prefix from DWARF- to Apple- to make space for the
future implementation of .debug_names.
This change is NFC and relies on the existing tests.
Differential revision: https://reviews.llvm.org/D42334
llvm-svn: 323653
This patch implements the device runtime library whose interface is used in the code generation for OpenMP offloading devices.
Currently there is a single device RTL written in CUDA meant to CUDA enabled GPUs.
The interface is a variation of the kmpc interface that includes some extra calls to do thread and storage management that only make sense for a GPU target.
Differential revision: https://reviews.llvm.org/D14254
llvm-svn: 323649
The test was failing because of an incorrect sizeof check in the name
index parsing code. This code was meant to check that we have enough
input to parse the fixed-size part of the dwarf header, which it did by
comparing the input to sizeof(Header). Originally struct Header only
contained the fixed-size part, but during review, we've moved additional
members into it, which rendered the sizeof check invalid.
I resolve this by moving the fixed-size part to a separate struct and
updating the sizeof-expression to use that.
llvm-svn: 323648
Summary:
All variants of isLogicalImm[Not](32|64) can be combined into a single templated function, same for printLogicalImm(32|64).
By making it use a template instead, further SVE patches can use it for other data types as well (e.g. 8, 16 bits).
Reviewers: fhahn, rengolin, aadg, echristo, kristof.beyls, samparker
Reviewed By: samparker
Subscribers: aemerson, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D42294
llvm-svn: 323646
`num_args` is unsigned integer, declared as below:
```
uint32_t num_args = arg_enum->getChildCount();
```
Comparison with the signed `arg_idx` produces a warning when compiled with
-Wsign-compare flag, this patch addresses this simple issue without affecting
any functionality.
Reviewers: davide, asmith
Subscribers: lldb-commits
Differential Revision: https://reviews.llvm.org/D42620
llvm-svn: 323645
Summary:
When emitting the location for a global variable with fragmented debug
expressions, make sure that the offset pieces, which represent
optimized-out parts of the variable, are emitted before their succeeding
fragments' expressions. Previously, if the succeeding fragment's
location was a symbol, the offset piece was emitted after, rather than
before, that symbol's expression. This effectively meant that the symbols
were associated with the wrong parts of the variable.
This fixes PR36085.
Patch by: David Stenberg
Reviewers: aprantl, probinson, dblaikie
Reviewed By: aprantl
Subscribers: JDevlieghere, llvm-commits
Tags: #debug-info
Differential Revision: https://reviews.llvm.org/D42527
llvm-svn: 323644
Summary: This was broken long ago in D12208, which failed to account for
the fact that 64-bit SPARC uses a stack bias of 2047, and it is the
*unbiased* value which should be aligned, not the biased one. This was
seen to be an issue with Rust.
Patch by: jrtc27 (James Clarke)
Reviewers: jyknight, venkatra
Reviewed By: jyknight
Subscribers: jacob_hansen, JDevlieghere, fhahn, fedor.sergeev, llvm-commits
Differential Revision: https://reviews.llvm.org/D39425
llvm-svn: 323643
The call to ScopedPrinter::printNumber with size_t argument was
ambiguous (I think) on 32-bit builds. Explicitly cast to a 64-bit int to
avoid this.
llvm-svn: 323642
Summary:
This modifies the dwarfdump output to align it with the new .debug_names
dump. It also renames two header fields to match similar fields in the
dwarf5 header.
A couple of tests needed to be updated to match new output. The changes
were fairly straight-forward, although not really automatable.
Reviewers: JDevlieghere, aprantl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42415
llvm-svn: 323641