This patch fixes a regression caused by the operand reordering refactoring patch https://reviews.llvm.org/D59973 .
The fix changes the strategy to Splat instead of Opcode, if broadcast opportunities are found.
Please see the lit test for some examples.
Committed on behalf of @vporpo (Vasileios Porpodas)
Differential Revision: https://reviews.llvm.org/D62427
llvm-svn: 362613
Instead of passing around fast-math-flags as a parameter, we can set those
using an IRBuilder guard object. This is no-functional-change-intended.
The motivation is to eventually fix the vectorizers to use and set the
correct fast-math-flags for reductions. Examples of that not behaving as
expected are:
https://bugs.llvm.org/show_bug.cgi?id=23116 (should be able to reduce with less than 'fast')
https://bugs.llvm.org/show_bug.cgi?id=35538 (possible miscompile for -0.0)
D61802 (should be able to reduce with IR-level FMF)
Differential Revision: https://reviews.llvm.org/D62272
llvm-svn: 362612
As reported in https://bugs.llvm.org/show_bug.cgi?id=42113, there are a
number of locations in Clang where it is assumed that exception
specifications are only valid in C++ mode. Since the original
justification for the NoThrow Exception Specifier Type was C++ related,
this patch just makes C mode use the attribute-based nothrow handling.
Additionally, I noticed that the handling of non-prototype functions
regressed the behavior of the nothrow attribute, in part because it is
was listed in the function type macro(which I did in the previous
patch). In reality, it should only be doing so in a conditional nature,
so this patch removes it there and puts it directly in the switch to be
handled correctly.
llvm-svn: 362607
References to arbitrary address spaces can't always be bound to
temporaries. This change extends the reference binding logic to
check that the address space of a temporary can be implicitly
converted to the address space in a reference when temporary
materialization is performed.
Differential Revision: https://reviews.llvm.org/D61318
llvm-svn: 362604
We have a few sections that can be added implicitly to the output:
".dynsym", ".dynstr", ".symtab", ".strtab" and ".shstrtab".
Problem appears when such section is listed explicitly in YAML.
In that case it's content is written twice:
first time during writing of regular sections listed in the document
and second time during special handling.
Because of that their file offsets can become unexpectedly broken:
(yaml file for sample below lists .dynsym explicitly before .text.foo)
Before patch:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[ 0] NULL 0000000000000000 00000000
0000000000000000 0000000000000000 0 0 0
[ 1] .dynsym DYNSYM 0000000000000100 00000250
0000000000000030 0000000000000018 A 6 0 8
[ 2] .text.foo PROGBITS 0000000000000200 00000200
0000000000000000 0000000000000000 AX 0 0 0
After patch:
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[ 0] NULL 0000000000000000 00000000
0000000000000000 0000000000000000 0 0 0
[ 1] .dynsym DYNSYM 0000000000000100 00000200
0000000000000030 0000000000000018 A 6 0 8
[ 2] .text.foo PROGBITS 0000000000000200 00000230
0000000000000000 0000000000000000 AX 0 0 0
This patch reorganizes our code and fixes the issue described.
Differential revision: https://reviews.llvm.org/D62809
llvm-svn: 362602
Now, when clang processes an argument of the form "-march=foo+x+y+z",
then instead of calling getArchExtFeature() for each of the extension
names "x", "y", "z" and appending the returned string to its list of
low-level subtarget features, it will call appendArchExtFeatures()
which does the appending itself.
The difference is that appendArchExtFeatures can add _more_ than one
low-level feature name to the output feature list if it has to, and
also, it gets told some information about what base architecture and
CPU the extension is going to go with, which means that "+fp" can now
mean something different for different CPUs. Namely, "+fp" now selects
whatever the _default_ FPU is for the selected CPU and/or
architecture, as defined in the ARM_ARCH or ARM_CPU_NAME macros in
ARMTargetParser.def.
On the clang side, I adjust DecodeARMFeatures to call the new
appendArchExtFeatures function in place of getArchExtFeature. This
means DecodeARMFeatures needs to be passed a CPU name and an ArchKind,
which meant changing its call sites to make those available, and also
sawing getLLVMArchSuffixForARM in half so that you can get an ArchKind
enum value out of it instead of a string.
Also, I add support here for the extension name "+fp.dp", which will
automatically look through the FPU list for something that looks just
like the default FPU except for also supporting double precision.
Differential Revision: https://reviews.llvm.org/D60697
llvm-svn: 362601
This is the LLVM part of this change, the Clang part contains the full
description in its commit message.
Differential Revision: https://reviews.llvm.org/D60697
llvm-svn: 362600
We already handle the case where we combine shuffle(extractsubvector(x),extractsubvector(x)), this relaxes the requirement to permit different sources as long as they have the same value type.
This causes a couple of cases where the VPERMV3 binary shuffles occur at a wider width than before, which I intend to improve in future commits - but as only the subvector's mask indices are defined, these will broadcast so we don't see any increase in constant size.
llvm-svn: 362599
Remove irrelevant options from standard help output.
New output:
OVERVIEW: llvm object size dumper
USAGE: llvm-size [options] <input files>
OPTIONS:
Generic Options:
--help - Display available options (--help-hidden for more)
--help-list - Display list of available options (--help-list-hidden for more)
--version - Display the version of this program
llvm-size Options:
Specify output format
-A - System V format
-B - Berkeley format
-m - Darwin -m format
--arch=<string> - architecture(s) from a Mach-O file to dump
--common - Print common symbols in the ELF file. When using Berkely format, this is added to bss.
Print size in radix:
-o - Print size in octal
-d - Print size in decimal
-x - Print size in hexadecimal
--format=<value> - Specify output format
=sysv - System V format
=berkeley - Berkeley format
=darwin - Darwin -m format
-l - When format is darwin, use long format to include addresses and offsets.
--radix=<value> - Print size in radix
=8 - Print size in octal
=10 - Print size in decimal
=16 - Print size in hexadecimal
--totals - Print totals of all objects - Berkeley format only
Differential Revision: https://reviews.llvm.org/D62482
llvm-svn: 362593
@jdoerfert Looks like these are placeholders for incoming abstract attributes patches so I've just commented the code out, even though this is usually frowned upon.
llvm-svn: 362592
Although many relocatable objects will have a single
GNU_PROPERTY_X86_FEATURE_1_AND in the .note.gnu.property section it is
permissible to have more than one, and there are tests in ld.bfd that use
it. The behavior that ld.bfd follows is to set the feature bit for a
relocatable object if any of the GNU_PROPERTY_X86_FEATURE_1_AND
have the feature bit set.
Differential Revision: https://reviews.llvm.org/D62862
llvm-svn: 362591
Summary:
If the provided LLVM build-tree used a multi-configuration generator like Xcode, `LLVM_TOOLS_BINARY_DIR` will have a generator-specific placeholder to express `CMAKE_CFG_INTDIR`. Thus `llvm-lit` and `llvm-tblgen` won't be found.
D62878 exports the actual configuration types so we can fix the path and add them to the search paths for `find_program()`.
Reviewers: xiaobai, labath, stella.stamenova
Reviewed By: xiaobai, stella.stamenova
Subscribers: mgorny, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D62879
llvm-svn: 362589
Summary: Useful info for standalone builds of subprojects. If a multi-configuration generator was used for the provided LLVM build-tree, standalone builds should consider actual subdirectories per configuration in `find_program()` (e.g. looking for `llvm-lit` or `llvm-tblgen`).
Reviewers: labath, beanz, mgorny
Subscribers: lldb-commits, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62878
llvm-svn: 362588
Add a test for tracking PR41027 (8.0 regression breaking assembly code
relying on __builtin_constant_p() to identify compile-time constants).
Mark it as expected to fail everywhere.
Differential Revision: https://reviews.llvm.org/D60728
llvm-svn: 362587
Summary:
r362103 exposed a bug, where we could read incorrect data if a skeleton
unit contained more than the single unit DIE. Clang emits these kinds of
units with -fsplit-dwarf-inlining (which is also the default).
Changing lldb to handle these DIEs is nontrivial, as we'd have to change
the UID encoding logic to be able to reference these DIEs, and fix up
various places which are assuming that all DIEs come from the separate
compile unit.
However, it turns out this is not necessary, as the DWO unit contains
all the information that the skeleton unit does. So, this patch just
skips parsing the extra DIEs if we have successfully found the DWO file.
This enforces the invariant that the rest of the code is already
operating under.
This patch fixes a couple of existing tests, but I've also included a
simpler test which does not depend on execution of binaries, and would
have helped us in catching this sooner.
Reviewers: clayborg, JDevlieghere, aprantl
Subscribers: probinson, dblaikie, lldb-commits
Differential Revision: https://reviews.llvm.org/D62852
llvm-svn: 362586
This reverts commit 5b32f60ec3.
The fix is in commit 4f9e68148b.
This patch fixes the CorrelatedValuePropagation pass to keep
prof branch_weights metadata of SwitchInst consistent.
It makes use of SwitchInstProfUpdateWrapper.
New tests are added.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D62126
llvm-svn: 362583
Tests that use -O1, -O2 and -O3 would often produce different results
with the new pass manager which makes these tests fail. Disable new PM
explicitly for these tests.
Differential Revision: https://reviews.llvm.org/D58375
llvm-svn: 362580
NOTE: Note that no attributes are derived yet. This patch will not go in
alone but only with others that derive attributes. The framework is
split for review purposes.
This commit introduces the Attributor pass infrastructure and fixpoint
iteration framework. Further patches will introduce abstract attributes
into this framework.
In a nutshell, the Attributor will update instances of abstract
arguments until a fixpoint, or a "timeout", is reached. Communication
between the Attributor and the abstract attributes that are derived is
restricted to the AbstractState and AbstractAttribute interfaces.
Please see the file comment in Attributor.h for detailed information
including design decisions and typical use case. Also consider the class
documentation for Attributor, AbstractState, and AbstractAttribute.
Reviewers: chandlerc, homerdin, hfinkel, fedor.sergeev, sanjoy, spatel, nlopes, nicholas, reames
Subscribers: mehdi_amini, mgorny, hiraditya, bollu, steven_wu, dexonsmith, dang, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59918
llvm-svn: 362578
This commit is a preparation of upcoming patches on attribute deduction.
It will shorten the diffs and make it clear what we inferred before.
Reviewers: chandlerc, homerdin, hfinkel, fedor.sergeev, sanjoy, spatel, nlopes
Subscribers: bollu, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59903
llvm-svn: 362577
Generally speaking, we lower to an optimal rotate sequence for nodes visible in
the SDAG. However, there are instances where the two rotates are not visible at
ISEL time - most notably those in a very common sequence when lowering switch
statements to jump tables.
A common situation is a switch on a 32-bit integer. This value has to have the
upper 32 bits cleared and because jump table offsets are word offsets, the value
needs to be shifted left by 2 bits. We currently emit the clear and the left
shift as two separate instructions, but this is not needed as we can lower it to
a single RLDIC.
This patch just cleans that up.
Differential revision: https://reviews.llvm.org/D60402
llvm-svn: 362576
NFC commit of a test case in order for the subsequent review to show differences
in codegen.
Differential revision: https://reviews.llvm.org/D62843
llvm-svn: 362573
The age field is only there to say how many times an OBJ or a PDB was incrementally linked. It shouldn't be used to validate the link between the OBJ and the PDB.
Differential Revision: https://reviews.llvm.org/D62837
llvm-svn: 362572
Part 2 (the Clang portion) of D59881.
This patch (first of two patches) enables the vectorizer to recognize the
IBM MASS vector library routines. This patch specifically adds support for
recognizing the -vector-library=MASSV option, and defines mappings from IEEE
standard scalar math functions to generic PowerPC MASS vector counterparts.
For instance, the generic PowerPC MASS vector entry for double-precision
cbrt function is __cbrtd2_massv.
The second patch will further lower the generic PowerPC vector entries to
PowerPC subtarget-specific entries.
For instance, the PowerPC generic entry cbrtd2_massv is lowered to
cbrtd2_P9 for Power9 subtarget.
The overall support for MASS vector library is presented as such in two patches
for ease of review.
Patch by Jeeva Paudel.
Differential revision: https://reviews.llvm.org/D59881
llvm-svn: 362571
In glibc, DT_PPC_GOT indicates that PowerPC32 Secure PLT ABI is used.
I plan to use it in D62464.
DT_PPC_OPT currently indicates if a TLSDESC inspired TLS optimization is
enabled.
Reviewed By: grimar, jhenderson, rupprecht
Differential Revision: https://reviews.llvm.org/D62851
llvm-svn: 362569
Summary:
This was flagged in https://www.viva64.com/en/b/0629/ under "Snippet No.
38".
Add an assertion, since it's unlikely that this parameter is nullptr.
Reviewers: RKSimon, fhahn
Reviewed By: RKSimon
Subscribers: fhahn, llvm-commits, RKSimon, srhines
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62229
llvm-svn: 362567