Summary:
For OS type AMDPAL, the scratch descriptor is loaded from offset 0 of
the GIT, whose 32 bit pointer is in s0 (s8 for gfx9 merged shaders).
This commit fixes that to use offset 0x10 instead of offset 0 for a
compute shader, per the PAL ABI spec.
V2: Ensure s0 (s8 for gfx9 merged shader) is marked live-in when loading
scratch descriptor from GIT.
Reviewers: kzhuravl, nhaehnle, timcorringham
Subscribers: kzhuravl, wdng, yaxunl, t-tye, llvm-commits, dstuttard, nhaehnle, arsenm
Differential Revision: https://reviews.llvm.org/D44468
Change-Id: I93dffa647758e37f613bb5e0dfca840d82e6d26f
llvm-svn: 329690
Much like any written register in load/store instructions, the status register
is not allowed to overlap with any others. So diagnose it like we already do
with the other cases.
llvm-svn: 329687
The BroadwellModelProcResources had an entry for HWPort5, which is a Haswell
resource, and not a Broadwell processor resource. That entry was added to the
Broadwell model because variable blends were consuming it.
This was clearly a typo (the resource name should have been BWPort5), which
unfortunately was never caught before. It was not reported as an error because
HWPort5 is a resource defined by the Haswell model. It has been found when
testing some code with llvm-mca: the list of resources in the resource pressure
view was odd.
This patch fixes the issue; now variable blend instructions consume 2 cycles on
BWPort5 instead of HWPort5. This is enough to get rid of the extra (spurious)
entry in the BroadWellModelProcResources table.
llvm-svn: 329686
The current support of the feature produces only 2 lines in report:
-Some general Code Generation Time;
-Total time of Backend Consumer actions.
This patch extends Clang time report with new lines related to Preprocessor, Include Filea Search, Parsing, etc.
Differential Revision: https://reviews.llvm.org/D43578
llvm-svn: 329684
It looks like we introduced isprint8 way back in r169417 to be used on
getopt's short_options, which we sometimes set to values which are out
of range for normal chars to indicate options with no short form.
However, this is not how the function is used in the Args class, where
we explicitly process a string character by character.
This removes the last external dependency from the Args class.
llvm-svn: 329682
These are not used anywhere in the Args class. They should have been
moved as a part of r327110 (Moving Option parsing from Args to Options),
but I did not notice them then.
This does not affect the layering in any way, but in makes sense for the
structs to be defined in the near the code that uses them.
llvm-svn: 329679
Summary:
The idea behind this is to move the functionality which depend on other lldb
classes into a separate class. This way, the Args class can be turned
into a lightweight arc+argv wrapper and moved into the lower lldb
layers.
Reviewers: jingham, zturner
Subscribers: lldb-commits
Differential Revision: https://reviews.llvm.org/D44306
llvm-svn: 329677
Summary:
Subtargets can define the libpfm counter names that can be used to
measure cycles and uops issued on ProcResUnits.
This allows making llvm-exegesis available on more targets.
Fixes PR36984.
Reviewers: gchatelet, RKSimon, andreadb, craig.topper
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D45360
llvm-svn: 329675
This cleans up a number of operations that only claimed te use EFLAGS
due to using DF. But no instructions which we think of us setting EFLAGS
actually modify DF (other than things like popf) and so this needlessly
creates uses of EFLAGS that aren't really there.
In fact, DF is so restrictive it is pretty easy to model. Only STD, CLD,
and the whole-flags writes (WRFLAGS and POPF) need to model this.
I've also somewhat cleaned up some of the flag management instruction
definitions to be in the correct .td file.
Adding this extra register also uncovered a failure to use the correct
datatype to hold X86 registers, and I've corrected that as necessary
here.
Differential Revision: https://reviews.llvm.org/D45154
llvm-svn: 329673
Disabling threads makes <atomic> unusable, but this is needed by LLVM
libraries that are dependencies of the symbolizer.
Differential Revision: https://reviews.llvm.org/D45424
llvm-svn: 329672
an APValue and retrieve it from map Temporaries.
The version number is needed when a single AST node is visited multiple
times and is used to create APValues that are required to be distinct
from each other (for example, MaterializeTemporaryExprs in default
arguments and VarDecls in loops).
rdar://problem/36505742
Differential Revision: https://reviews.llvm.org/D42776
llvm-svn: 329671
Prefer to use the 32-bit AND with immediate instead.
Primarily I'm doing this to ensure that immediates created by shrinkAndImmediate will always get absorbed into the AND. But I do believe this would be a reduction in the number of uops that need to execute. Ideally we should shrink the 'and' and the 'load' during DAG combine to re-enable the fold.
Fixes PR37063.
llvm-svn: 329667
While reading Codeview records which contain variable-length encoded integers,
such as LF_BCLASS, LF_ENUMERATE, LF_MEMBER, LF_VBCLASS or LF_IVBCLASS,
the record's size would be improperly calculated in cases where the value was
indeed of a variable length (>= LF_NUMERIC). This caused a bad alignement on
the next record, which would/might crash later on.
Differential Revision: https://reviews.llvm.org/D45104
llvm-svn: 329659
The key idea is to lower COPY nodes populating EFLAGS by scanning the
uses of EFLAGS and introducing dedicated code to preserve the necessary
state in a GPR. In the vast majority of cases, these uses are cmovCC and
jCC instructions. For such cases, we can very easily save and restore
the necessary information by simply inserting a setCC into a GPR where
the original flags are live, and then testing that GPR directly to feed
the cmov or conditional branch.
However, things are a bit more tricky if arithmetic is using the flags.
This patch handles the vast majority of cases that seem to come up in
practice: adc, adcx, adox, rcl, and rcr; all without taking advantage of
partially preserved EFLAGS as LLVM doesn't currently model that at all.
There are a large number of operations that techinaclly observe EFLAGS
currently but shouldn't in this case -- they typically are using DF.
Currently, they will not be handled by this approach. However, I have
never seen this issue come up in practice. It is already pretty rare to
have these patterns come up in practical code with LLVM. I had to resort
to writing MIR tests to cover most of the logic in this pass already.
I suspect even with its current amount of coverage of arithmetic users
of EFLAGS it will be a significant improvement over the current use of
pushf/popf. It will also produce substantially faster code in most of
the common patterns.
This patch also removes all of the old lowering for EFLAGS copies, and
the hack that forced us to use a frame pointer when EFLAGS copies were
found anywhere in a function so that the dynamic stack adjustment wasn't
a problem. None of this is needed as we now lower all of these copies
directly in MI and without require stack adjustments.
Lots of thanks to Reid who came up with several aspects of this
approach, and Craig who helped me work out a couple of things tripping
me up while working on this.
Differential Revision: https://reviews.llvm.org/D45146
llvm-svn: 329657
A check in assert-builds was meant to verify that a load provides a
value in all statement instances (i.e. its domain). The domain is
commonly gist'ed within the parameter context to contain fewer
constraints. However, statement instances outside the context are
no valid executions, hence the value provided can be undefined.
Refine the check for valid loads to only needed to be defined within
the SCoP context.
In addition, the JSONImporter had to be changed to allow importing
access relations that are broader than the current access relation,
but still defined over all statement instances.
This should fix the compiler crash in test-suite's oggenc of the
-polly-process-unprofitable buildbot.
llvm-svn: 329655
Commit r329640 introduced the removal of all MemoryAccesses of a Scop.
It accidentally continued iterating over a vector whose iterators
have been invalidated by a MemoryAccess removal.
Make a copy of the MemoryAccesses to remove to iterate over while
removing them.
llvm-svn: 329653
GCC 4.8.4 on a bot was warning about `ArgPassingKind` not fitting in
`ArgPassingRestrictions`, which appears to be incorrect, since
`ArgPassingKind` only has three potential values:
"warning: 'clang::RecordDecl::ArgPassingRestrictions' is too small to
hold all values of 'enum clang::RecordDecl::ArgPassingKind'"
Additionally, I remember hearing (though my knowledge may be outdated)
that MSVC won't merge adjacent bitfields if their types are different.
Try to fix both issues by turning these into `uint8_t`s.
llvm-svn: 329652
v2: Fix whitespace errors
Use only subnormal path.
Passes CTS on carrizo and turks.
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewer: Aaron Watry <awatry@gmail.com>
llvm-svn: 329647
Summary:
Another clean up, following D43208.
Interleaved memory access analysis/optimization has nothing to do with vectorization legality. It doesn't really belong there. On the other hand, cost model certainly has to know about it.
In principle, vectorization should proceed like Legality ==> Optimization ==> CostModel ==> CodeGen, and this change just does that,
by moving the interleaved access analysis/decision out of Legal, and run it just before CostModel object is created.
After this, I can move LoopVectorizationLegality and Hints/Requirements classes into it's own header file, making it shareable within Transform tree. I have the patch already but I don't want to mix with this change. Eventual goal is to move to Analysis tree, but I first need to move RecurrenceDescriptor/InductionDescriptor from Transform/Util/LoopUtil.* to Analysis.
Reviewers: rengolin, hfinkel, mkuper, dcaballe, sguggill, fhahn, aemerson
Reviewed By: rengolin
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D45072
llvm-svn: 329645
Summary:
SSAUpdater is a bottleneck in JumpThreading, and this patch improves the
situation by using SSAUpdaterBulk instead.
Compile time impact: no noticable changes on CTMark, a big improvement
on the test from PR16756.
Reviewers: dberlin, davide, MatzeB
Subscribers: llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D44282
llvm-svn: 329644
Summary:
SSAUpdater is a bottleneck in a number of passes, and one of the reasons
is that it performs a lot of unnecessary computations (DT/IDF) over and
over again. This patch adds a new SSAUpdaterBulk that uses existing DT
and avoids recomputing IDF when possible.
Reviewers: dberlin, davide, MatzeB
Subscribers: llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D44282
llvm-svn: 329643
Removing a statement left its MemoryAccesses in some lists and maps of
the SCoP. Which lists depends on at which phase of the SCoP
construction the statement is deleted. Follow-up passes could still see
the already deleted MemoryAccesses by iterating through these
lists/maps, resulting in an access violation.
When removing a ScopStmt, also remove all its MemoryAccesses by using
the same mechnism that removes a MemoryAccess.
llvm-svn: 329640
std::remove, despite its name, does not remove elements from a list, but
only moves them to the end of a list. Call erase() to shorten the
vector to the remaining elements.
Test case included in next commit.
llvm-svn: 329639
The caching walker used to hold its own caches, which made its `reset()`
function meaningful. Since caching has been moved out of it, there's no
reason to continue to have these cache-related methods.
Similarly, the EXPENSIVE_CHECKS block that's getting removed used to
rerun the query with caching disabled. Since that's how we always do
queries now, it's redundant.
llvm-svn: 329638
I'm proposing a new command line flag, --warn-backrefs in this patch.
The flag and the feature proposed below don't exist in GNU linkers
nor the current lld.
--warn-backrefs is an option to detect reverse or cyclic dependencies
between static archives, and it can be used to keep your program
compatible with GNU linkers after you switch to lld. I'll explain the
feature and why you may find it useful below.
lld's symbol resolution semantics is more relaxed than traditional
Unix linkers. Therefore,
ld.lld foo.a bar.o
succeeds even if bar.o contains an undefined symbol that have to be
resolved by some object file in foo.a. Traditional Unix linkers
don't allow this kind of backward reference, as they visit each
file only once from left to right in the command line while
resolving all undefined symbol at the moment of visiting.
In the above case, since there's no undefined symbol when a linker
visits foo.a, no files are pulled out from foo.a, and because the
linker forgets about foo.a after visiting, it can't resolve
undefined symbols that could have been resolved otherwise.
That lld accepts more relaxed form means (besides it makes more
sense) that you can accidentally write a command line or a build
file that works only with lld, even if you have a plan to
distribute it to wider users who may be using GNU linkers. With
--check-library-dependency, you can detect a library order that
doesn't work with other Unix linkers.
The option is also useful to detect cyclic dependencies between
static archives. Again, lld accepts
ld.lld foo.a bar.a
even if foo.a and bar.a depend on each other. With --warn-backrefs
it is handled as an error.
Here is how the option works. We assign a group ID to each file. A
file with a smaller group ID can pull out object files from an
archive file with an equal or greater group ID. Otherwise, it is a
reverse dependency and an error.
A file outside --{start,end}-group gets a fresh ID when
instantiated. All files within the same --{start,end}-group get the
same group ID. E.g.
ld.lld A B --start-group C D --end-group E
A and B form group 0, C, D and their member object files form group
1, and E forms group 2. I think that you can see how this group
assignment rule simulates the traditional linker's semantics.
Differential Revision: https://reviews.llvm.org/D45195
llvm-svn: 329636
registers.
This patch fixes a bug in r328731 that caused structs transitively
containing __weak fields to be passed in registers. The patch replaces
the flag RecordDecl::CanPassInRegisters with a 2-bit enum that indicates
whether the struct or structs containing the struct are forced to be
passed indirectly.
This reapplies r329617. r329617 didn't specify the underlying type for
enum ArgPassingKind, which caused regression tests to fail on a windows
bot.
rdar://problem/39194693
Differential Revision: https://reviews.llvm.org/D45384
llvm-svn: 329635
Summary:
- getentropy presence since late 2014, safe to use.
- guarantees to delivers good random data up to 256 bytes.
- fall back to /dev/urandom as long the buffer is correct.
Patch by David CARLIER
Reviewers: kubamracek, vitalybuka
Reviewed By: vitalybuka
Subscribers: cryptoad, llvm-commits, #sanitizers
Differential Revision: https://reviews.llvm.org/D44866
llvm-svn: 329633