--push-state implemented in this patch saves the states of --as-needed,
--whole-archive and --static. It saves less number of flags than GNU linkers.
Since even GNU linkers save different flags, no one seems to care about the
details. In this patch, I tried to save the minimal number of flags to not
complicate the implementation and the siutation.
I'm not personally happy about adding the --{push,pop}-state flags though.
That options seem too hacky to me. However, gcc started using the options
since GCC 8 when GNU ld is available at the build time. Therefore, lld
is no longer a drop-in replacmenet for GNU linker for that machine
without supporting the flags.
Fixes https://bugs.llvm.org/show_bug.cgi?id=34567
Differential Revision: https://reviews.llvm.org/D47542
llvm-svn: 333646
Convert a vector load intrinsic into an llvm load instruction.
This is beneficial when the underlying object being addressed
comes from a constant, since we get constant-folding for free.
Differential Revision: https://reviews.llvm.org/D46273
llvm-svn: 333643
Summary:
- I've measured the values for Broadwell, Haswell, SandyBridge, Skylake.
- For ZnVer1 and Atom, values were transferred form `InstRW`s.
- For SLM and BtVer2, values are from Agner.
This is split off from https://reviews.llvm.org/D47377
Reviewers: RKSimon, andreadb
Subscribers: gbedwell, llvm-commits
Differential Revision: https://reviews.llvm.org/D47523
llvm-svn: 333642
This improves splat rotations (rotation by an uniform value), to avoid having to use the generic non-uniform shift code (extension to PR37426).
llvm-svn: 333641
This test was using unittest (not unittest2) as the test framework, and
it worked with dotest only by accident. Remove it as we have a much more
realistic example test in test/testcases/sample_test.
llvm-svn: 333640
Summary:
As discussed in https://bugs.llvm.org/show_bug.cgi?id=37317,
FindGlobalVariables does not properly handle the case where
append=false. As this doesn't seem to be used in the tree, this patch
removes the parameter entirely.
Reviewers: clayborg, jingham, labath
Reviewed By: clayborg
Subscribers: aprantl, lldb-commits, kubamracek, JDevlieghere
Differential Revision: https://reviews.llvm.org/D46885
Patch by Tom Tromey <ttromey@mozilla.com>.
llvm-svn: 333639
Summary:
In rL327851 the createUniqueFile() and createTemporaryFile()
variants that do not return the file descriptors were changed to
create empty files, rather than only check if the paths are free.
This change was done in order to make the functions race-free.
That change led to clang-tidy (and possibly other tools) leaving
behind temporary assembly files, of the form placeholder-*, when
using a target that does not support the internal assembler.
The temporary files are created when building the Compilation
object in stripPositionalArgs(), as a part of creating the
compilation database for the arguments after the double-dash. The
files are created by Driver::GetNamedOutputPath().
Fix this issue by cleaning out temporary files at the deletion of
Compilation objects.
This fixes https://bugs.llvm.org/show_bug.cgi?id=37091.
Reviewers: klimek, sepavloff, arphaman, aaron.ballman, john.brawn, mehdi_amini, sammccall, bkramer, alexfh, JDevlieghere
Reviewed By: aaron.ballman, JDevlieghere
Subscribers: erichkeane, lebedev.ri, Ka-Ka, cfe-commits
Differential Revision: https://reviews.llvm.org/D45686
llvm-svn: 333637
rL145086 introduced m_die_array.shrink_to_fit() implemented by
exact_size_die_array.swap, it was before LLVM became written in C++11.
Differential revision: https://reviews.llvm.org/D47492
llvm-svn: 333636
Summary:
Both (Apple and DWARF5) implementations of the iterators had bugs which
resulted in crashes if one attempted to iterate through the accelerator
tables all the way.
For the Apple tables, the issue was that we did not clear the DataOffset
field when we reached the end, which made our iterator compare unequal
to the "end" iterator. For the Dwarf5 tables, the problem was that we
incremented the CurrentIndex pointer and then used the incremented
(possibly invalid) pointer to check whether we have reached the end of
the index list.
The reason these bugs went undetected is because their only user
(dwarfdump) only ever searched for the first match. Besides allowing us
to test this fix, changing llvm-dwarfdump --find to display all matches
seems like a good improvement (it makes the behavior consistent with the
--name option), so I change llvm-dwarfdump to do that.
The existing tests would be sufficient to test this fix with the new
llvm-dwarfdump behavior, but I add a special test that demonstrates that
the tool indeed displays multiple results. The find.test test needed to
be tweaked a bit as the tool now does not print the ".debug_info
contents" header (also consistent with how --name works).
Reviewers: JDevlieghere, aprantl, dblaikie
Subscribers: mgrang, llvm-commits
Differential Revision: https://reviews.llvm.org/D47543
llvm-svn: 333635
Summary:
I'm slowly looking into a new X86 scheduler model,
for AMD Bulldozer CPU, model 2 (bdver2, Piledriver).
And naturally, i have hit that assert :)
I happened to know what it meant, and how to fix it,
but that is not too common knowledge.
Reviewers: courbet, RKSimon
Reviewed By: courbet
Subscribers: tschuett, llvm-commits, craig.topper
Differential Revision: https://reviews.llvm.org/D47572
llvm-svn: 333632
In post-commit review, Eric Christopher notes that many
new MSan warnings are being observed with this patch.
The probable reason is: if 'y' is undef here and we could
evaluate it twice and get different results.
We can't increase the number of uses of a value.
llvm-svn: 333631
Keep track of achieved occupancy in SIMachineFunctionInfo.
At the moment we have a lot of duplicated or even missed code to
query and maintain occupancy info. Record it in the MFI and
query in a single call. Interfaces:
- getOccupancy() - returns current recorded achieved occupancy.
- getMinAllowedOccupancy() - returns lesser of the achieved occupancy
and the lowest occupancy we are ready to tolerate. For example if
a kernel is memory bound we are ready to tolerate 4 waves.
- limitOccupancy() - record occupancy level if we have to lower it.
- increaseOccupancy() - record occupancy if scheduler managed to
increase the occupancy.
MFI takes care of integrating different checks affecting occupancy,
including LDS use and waves-per-eu attribute. Note that scheduler
starts with not yet known register pressure, so has to record either
limit or increase in occupancy after it is done. Later passes can
just query a resulting value.
New interface is used in the active scheduler and NFC wrt its work.
Changes are also made to experimental schedulers to use it and record
an occupancy after they are done. Before the change waves-per-eu was
ignored by experimental schedulers and tolerance window for memory
bound kernels was not used.
Differential Revision: https://reviews.llvm.org/D47509
llvm-svn: 333629
Summary:
This is part of the larger XRay Profiling Mode effort.
This patch implements a centralised collector for `FunctionCallTrie`
instances, associated per thread. It maintains a global set of trie
instances which can be retrieved through the XRay API for processing
in-memory buffers (when registered). Future changes will include the
wiring to implement the actual profiling mode implementation.
This central service provides the following functionality:
* Posting a `FunctionCallTrie` associated with a thread, to the central
list of tries.
* Serializing all the posted `FunctionCallTrie` instances into
in-memory buffers.
* Resetting the global state of the serialized buffers and tries.
Depends on D45757.
Reviewers: echristo, pelikan, kpw
Reviewed By: kpw
Subscribers: llvm-commits, mgorny
Differential Revision: https://reviews.llvm.org/D45758
llvm-svn: 333624
v2: use "ensureAlignment"
make functions cache line aligned
Fixes GPU hangs since r333219:
"AMDGPU: Split R600 AsmPrinter code into its own class"
Differential Revision: https://reviews.llvm.org/D47516
llvm-svn: 333622
Besides other changes, this update introduces functions to translate a
maps and sets into lists of their elements. These lists are useful as
we can define iterators for lists, which allow us to replace many uses
of foreach.
llvm-svn: 333621
This fixes projects/compiler-rt/test/fuzzer/sigusr.test, which was
broken by r333614. The trouble was that "&&" changes the command for
which "$!" gives the pid.
llvm-svn: 333620
The majority of the cases were correct. This fixes the few that weren't.
I also removed some superfluous parentheses in non-macros that confused by attempts at grepping for missing casts.
llvm-svn: 333615
(Relands r333584, reverted in 333592.)
When debugging test failures with -vv (or -v in the case of the
internal shell), this makes it easier to locate the RUN line that
failed. For example, clang's test/Driver/linux-ld.c has 892 total RUN
lines, and clang's test/Driver/arm-cortex-cpus.c has 424 RUN lines
after concatenation for line continuations.
When reading the generated shell script, this also makes it easier to
locate the RUN line that produced each command.
To support reporting RUN line numbers in the case of the internal
shell, this patch extends the internal shell to support the null
command, ":", except pipelines are not supported.
To support reporting RUN line numbers in the case of windows cmd.exe
as the external shell, this patch extends -vv to set "echo on" instead
of "echo off" in bat files. (Support for windows cmd.exe as a lit
external shell will likely be dropped later, but I found out too
late.)
Reviewed By: delcypher, asmith, stella.stamenova, jmorse, lebedev.ri, rnk
Differential Revision: https://reviews.llvm.org/D44598
llvm-svn: 333614
I think this is a holdover from when we used to declare variables inside the macros. And then its been copy and pasted forward for years every time a new macro intrinsic gets added.
Interestingly this caused some tests for IRGen to be slightly more optimized. We now return a zeroinitializer directly instead of going through a store+load.
It also removed a bogus error message on another test.
llvm-svn: 333613
Previously, the checker was using the nullability of the expression,
which is nonnull IFF both receiver and method are annotated as _Nonnull.
However, the receiver could be known to the analyzer to be nonnull
without being explicitly marked as _Nonnull.
rdar://40635584
Differential Revision: https://reviews.llvm.org/D47510
llvm-svn: 333612
Don't always:
cast (select (cmp x, y), z, C) --> select (cmp x, y), (cast z), C'
This is something that came up as far back as D26556, and I lost track of it.
I suspect that this transform is part of the underlying problem that is
inspiring some of the recent proposals that seek to match larger patterns
that include a cast op. Even if that's not true, this transform causes
problems for codegen (particularly with vector types).
A transform to actively match the size of cmp and select operand sizes should
follow. This patch just removes the harmful canonicalization in the other
direction.
Differential Revision: https://reviews.llvm.org/D47163
llvm-svn: 333611
Discard the last uncompleted deferred region in a decl, if one exists.
This prevents lines at the end of a function containing only whitespace
or closing braces from being marked as uncovered, if they follow a
region terminator (return/break/etc).
The previous behavior was to heuristically complete deferred regions at
the end of a decl. In practice this ended up being too brittle for too
little gain. Users would complain that there was no way to reach full
code coverage because whitespace at the end of a function would be
marked uncovered.
rdar://40238228
Differential Revision: https://reviews.llvm.org/D46918
llvm-svn: 333609
Summary:
Fix PR37625. It's possible for an extern_weak declaration to be emitted
to the merged module when a definition exists in the ThinLTO portion of
the build; discard the linkage on the declaration in that case.
(otherwise we copy the linkage to the alias to the jumptable and fail)
Reviewers: pcc
Reviewed By: pcc
Subscribers: mehdi_amini, llvm-commits, kcc
Differential Revision: https://reviews.llvm.org/D47494
llvm-svn: 333604
Most of the origial comments used C style /* */ comments, but some C++ // comments had snuck in over time.
Still need to convert all the doxygen comments. Which is much harder to do.
llvm-svn: 333603
Looks like we intended to compare this->Members with Other->Members
here, but ended up comparing this->Members with this->Members. Oops. :)
Since CongruenceClass::Members is a SmallPtrSet anyway, we can probably
skip building std::sets if we're willing to write a bit more code.
This appears to be no functional change (for sufficiently lax values of
"no"): this equality check was only being called inside of an assert.
So, worst case, we'll catch more bugs in the form of assertion failures.
Thanks to d0k for noting this!
llvm-svn: 333601