This patch fixes an invalid memory read introduced by r346487.
Before this patch, partial register write had to query the latency of the
dependent full register write by calling a method on the full write descriptor.
However, if the full write is from an already retired instruction, chances are
that the EntryStage already reclaimed its memory.
In some parial register write tests, valgrind was reporting an invalid
memory read.
This change fixes the invalid memory access problem. Writes are now responsible
for tracking dependent partial register writes, and notify them in the event of
instruction issued.
That means, partial register writes no longer need to query their associated
full write to check when they are ready to execute.
Added test X86/BtVer2/partial-reg-update-7.s
llvm-svn: 347459
A consequence of r347274 is that SCALAR_TO_VECTOR can be converted into
BUILD_VECTOR by SimplifyDemandedBits, but LowerBUILD_VECTOR can turn
BUILD_VECTOR into SCALAR_TO_VECTOR so we get an infinite loop.
Fix this by making LowerBUILD_VECTOR not do this transformation for those
vectors that would get transformed back, i.e. BUILD_VECTOR of a single-element
constant vector. Doing that means we get a DUP, which we then need to recognise
in ISel as a copy.
llvm-svn: 347456
Now it returns Symbol. This should be NFC that
avoids doing cast at the caller's sides.
Differential revision: https://reviews.llvm.org/D54627
llvm-svn: 347455
a normal base class that provides all common "call" functionality.
This merges two complex CRTP mixins for the common "call" logic and
common operand bundle logic into a single, normal base class of
`CallInst` and `InvokeInst`. Going forward, users can typically
`dyn_cast<CallBase>` and use the resulting API. No more need for the
`CallSite` wrapper. I'm planning to migrate current usage of the wrapper
to directly use the base class and then it can be removed, but those are
simpler and much more incremental steps. The big change is to introduce
this abstraction into the type system.
I've tried to do some basic simplifications of the APIs that I couldn't
really help but touch as part of this:
- I've tried to organize the attribute API and bundle API into groups to
make understanding the API of `CallBase` easier. Without this,
I wasn't able to navigate the API sanely for all of the ways I needed
to modify it.
- I've added what seem like more clear and consistent APIs for getting
at the called operand. These ended up being especially useful to
consolidate the *numerous* duplicated code paths trying to do this.
- I've largely reworked the organization and implementation of the APIs
for computing the argument operands as they needed to change to work
with the new subclass approach.
To minimize any cost associated with this abstraction, I've moved the
operand layout in memory to store the called operand last. This makes
its position relative to the end of the operand array the same,
regardless of the subclass. It should make it much cheaper to reference
from the `CallBase` abstraction, and this is likely one of the most
frequent things to query.
We do still pay one abstraction penalty here: we have to branch to
determine whether there are 0 or 2 extra operands when computing the end
of the argument operand sequence. However, that seems both rare and
should optimize well. I've implemented this in a way specifically
designed to allow it to optimize fairly well. If this shows up in
profiles, we can add overrides of the relevant methods to the subclasses
that bypass this penalty. It seems very unlikely that this will be an
issue as the code was *already* dealing with an ever present abstraction
of whether or not there are operand bundles, so this isn't the first
branch to go into the computation.
I've tried to remove as much of the obvious vestigial API surface of the
old CRTP implementation as I could, but I suspect there is further
cleanup that should now be possible, especially around the operand
bundle APIs. I'm leaving all of that for future work in this patch as
enough things are changing here as-is.
One thing that made this harder for me to reason about and debug was the
pervasive use of unsigned values in subtraction and other arithmetic
computations. I had to debug more than one unintentional wrap. I've
switched a few of these to use `int` which seems substantially simpler,
but I've held back from doing this more broadly to avoid creating
confusing divergence within a single class's API.
I also worked to remove all of the magic numbers used to index into
operands, putting them behind named constants or putting them into
a single method with a comment and strictly using the method elsewhere.
This was necessary to be able to re-layout the operands as discussed
above.
Thanks to Ben for reviewing this (somewhat large and awkward) patch!
Differential Revision: https://reviews.llvm.org/D54788
llvm-svn: 347452
Summary:
- Reads are never executed if canceled before ready-to run.
In practice, we finalize cancelled reads eagerly and out-of-order.
- Cancelled reads don't prevent prior updates from being elided, as they don't
actually depend on the result of the update.
- Updates are downgraded from WantDiagnostics::Yes to WantDiagnostics::Auto when
cancelled, which allows them to be elided when all dependent reads are
cancelled and there are subsequent writes. (e.g. when the queue is backed up
with cancelled requests).
The queue operations aren't optimal (we scan the whole queue for cancelled
tasks every time the scheduler runs, and check cancellation twice in the end).
However I believe these costs are still trivial in practice (compared to any
AST operation) and the logic can be cleanly separated from the rest of the
scheduler.
Reviewers: ilya-biryukov
Subscribers: javed.absar, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D54746
llvm-svn: 347450
Summary:
It will cause test tools `FileCheck`, `count`, `not` being built blindly, these
dependencies should move back to clang-tools-extra.
Reviewers: mgorny
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D54797
llvm-svn: 347448
r334871 has made it possible for TableGen'erated code to select BFC, but
it has not added a test for it on the ARM side. Add it now to make sure
we don't introduce regressions if we ever change anything about that
rule.
llvm-svn: 347447
Implement getIntrinsicInstrCost() and return costs reflecting that bswap can
be done with a vperm per vector register.
Review: Ulrich Weigand
https://reviews.llvm.org/D54789
llvm-svn: 347445
These changed as a result of r347379. Unfortunately there was a
regression; filed PR39748 to track it.
Differential Revision: https://reviews.llvm.org/D54821
llvm-svn: 347442
GCC does it this way, and we have to be consistent. This includes
stdcall and fastcall functions with suffixes. I confirmed that a
fastcall function named "foo" ends up in ".text$foo", not
".text$@foo@8".
Based on a patch by Andrew Yohn!
Fixes PR39218.
Differential Revision: https://reviews.llvm.org/D54762
llvm-svn: 347431
This reverts commit r347413: older versions of ld.gold that are used
by Android don't support --push/pop-state which broke sanitizer bots.
llvm-svn: 347430
Previously we were taking over 13 minutes to link Firefox's xul.dll
on ARM64; this reduces link time to around 18s on my machine.
The root cause of the problem was that all of the input .pdata sections
had the same unrelocated section data and therefore the same hash,
which made segregation quadratic in the number of .pdata sections. The
reason why we weren't observing this on other architectures was that
ARM has a different .pdata format. On non-ARM the format is (start
address, end address, .xdata), which caused the size of the function
to appear in the unrelocated section data where the end address field
is. However, the ARM format omits the end address field.
Fixes PR39667.
Differential Revision: https://reviews.llvm.org/D54809
llvm-svn: 347429
This is further cleanup for PPCMCCodeEmitter. The class had been contained
within the cpp file alone. Now it has been split up between a header file and
a cpp file which allows other classes to make use of the functions in this class
if required.
llvm-svn: 347428
We don't support mac OS 10.6 and older anymore, so this macro can never
be defined. This bit of code had been added in D28931 as a fix for
PR31448, but it doesn't seem necessary anymore.
llvm-svn: 347427
For the NVPTX target default locations should be emitted as constants +
additional info must be emitted in the reserved_2 field of the ident_t
structure. The 1st bit controls the execution mode and the 2nd bit
controls use of the lightweight runtime. The combination of the bits for
Non-SPMD mode + lightweight runtime represents special undefined mode,
used outside of the target regions for orphaned directives or functions.
Should allow and additional optimization inside of the target regions.
llvm-svn: 347425
This transform needs to be limited.
We are converting to a constant pool load very early, and we
are turning loads that are independent of the select condition
(and therefore speculatable) into a dependent non-speculatable
load.
We may also be transferring a condition code from an FP register
to integer to create that dependent load.
llvm-svn: 347424
The iterator types for different specializations of containers with the
same element type but different allocators are not required to be
convertible. This patch makes the test to take the iterator type from
the same container specialization as the created container.
Reviewed as https://reviews.llvm.org/D54806.
Thanks to Andrey Maksimov for the patch.
llvm-svn: 347423
I wasn't able to reproduce the issue referred to by the comment using
the libc++'s shipped with mac OS X 10.7 and 10.8, so I assume this may
have been fixed in a function that is now shipped in the headers. In
that case, the tests will pass no matter what dylib we're using.
In the worst case, some test bots will start failing and I'll understand
why I was wrong, and I can create an actual lit feature for it. Note
that I could just leave this test alone, but this change is on the path
towards eradicating vendor-specific availability markup from the test
suite.
llvm-svn: 347421
Summary:
Currently we can't install the modulemaps provided by LLVM, since they are not structured to support headers generated as part of the build (ex. `llvm/IR/Attributes.gen`).
This patch restructures the module maps in order to support installation.
Modules containing generated headers are defined in the new `module.extern.modulemap` file, and are referenced from the main `module.modulemap` using `extern module`. There are two versions of the `module.extern.modulemap` file; one used when building and another, `module.install.modulemap`, which is re-named during installation.
Users can opt-into module map installation using `-DLLVM_INSTALL_MODULEMAPS=ON`. The default value is `OFF` due to llvm.org/PR31905.
Reviewers: rsmith, mehdi_amini, bruno, EricWF
Reviewed By: EricWF
Subscribers: tschuett, chapuni, mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D53510
llvm-svn: 347420
Summary:
D48660 / rL335762 added a `silence_unsigned_overflow` env flag for [[ https://github.com/google/oss-fuzz/pull/1717 | oss-fuzz needs ]],
that allows to silence the reports from unsigned overflows.
It makes sense, it is there because `-fsanitize=integer` sanitizer is not enabled on oss-fuzz,
so this allows to still use it as an interestingness signal, without getting the actual reports.
However there is a slight problem here.
All types of unsigned overflows are ignored.
Even if `-fno-sanitize-recover=unsigned` was used (which means the program will die after the report)
there will still be no report, the program will just silently die.
At the moment there are just two projects on oss-fuzz that care:
* [[ 8eeffa627f/projects/llvm_libcxx/build.sh (L18-L20) | libc++ ]]
* [[ 8eeffa627f/projects/librawspeed/build.sh | RawSpeed ]] (me)
I suppose this could be overridden there ^, but i really don't think this is intended behavior in any case..
Reviewers: kcc, Dor1s, #sanitizers, filcab, vsk, kubamracek
Reviewed By: Dor1s
Subscribers: dberris, mclow.lists, llvm-commits
Tags: #sanitizers
Differential Revision: https://reviews.llvm.org/D54771
llvm-svn: 347415
Sanitizer runtime link deps handling passes --no-as-needed because of
PR15823, but it never undoes it and this flag may affect other libraries
that come later on the link line. To avoid this, wrap Sanitizer link
deps in --push/pop-state.
Differential Revision: https://reviews.llvm.org/D54805
llvm-svn: 347413