heap.py has a lot of large hand written expressions and each name in the
expression will be looked up by clang during expression parsing. For
function parameters this will be in Sema::ActOnParamDeclarator(...) in order to
catch redeclarations of parameters. The names are not needed and we have seen
some rare cases where since we don't have symbols we end up in
SymbolContext::FindBestGlobalDataSymbol(...) which may conflict with other global
symbols.
There may be a way to make this lookup smarter to avoid these cases but it is
not clear how well tested this path is and how much work it would be to fix it.
So we will go with this fix while we investigate more.
Ref: rdar://78265641
As discussed in the post-commit comments for:
3cdd05e519
It seems to be safe to propagate all flags from the final fneg
except for 'nsz' to the new select:
https://alive2.llvm.org/ce/z/J_APDc
nsz has unique FMF semantics: it is not poison, it is only
"insignificant" in the calculation according to the LangRef.
trusty.cpp and trusty.h define Trusty implementations of map and other
platform-specific functions. In addition to adding Trusty configurations
in allocator_config.h and size_class_map.h, MapSizeIncrement and
PrimaryEnableRandomOffset are added as configurable options in
allocator_config.h.
Background on Trusty: https://source.android.com/security/trusty
Differential Revision: https://reviews.llvm.org/D103578
When executing a script command in HandleCommand(s) we currently print
its output twice
You can see this issue in action when adding a breakpoint command:
(lldb) b main
Breakpoint 1: where = main.out`main + 13 at main.cpp:2:3, address = 0x0000000100003fad
(lldb) break command add 1 -o "script print(\"Hey!\")"
(lldb) r
Process 76041 launched: '/tmp/main.out' (x86_64)
Hey!
(lldb) script print("Hey!")
Hey!
Process 76041 stopped
The issue is caused by HandleCommands using a temporary
CommandReturnObject and one of the commands (`script` in this case)
setting an immediate output stream. This causes the result to be printed
twice: once directly to the immediate output stream and once when
printing the result of HandleCommands.
This patch fixes the issue by introducing a new option to suppress
immediate output for temporary CommandReturnObjects.
Differential revision: https://reviews.llvm.org/D103349
If we cannot otherwise use a VMOVimm/VMOVFPimm/VMVNimm, fall back to
producing a VDUP(const) as opposed to a constant pool load. This will at
least be smaller codesize and can allow the VDUP to be folded into other
instructions.
Differential Revision: https://reviews.llvm.org/D103808
- Add `-enable-ocl-mangling-mismatch-workaround` to work around the
mismatch on OCL name mangling so far.
Reviewed By: yaxunl, rampitec
Differential Revision: https://reviews.llvm.org/D103920
The various maps in Symtab lead to some repetative code. This should
improve the situation somewhat.
Differential Revision: https://reviews.llvm.org/D103652
The FileCheck lines in these files are auto-generated and complete,
so there's very little upside (less CHECK lines) from running
-instcombine on them and violating the expected test layering
(optimizer developers shouldn't have to be aware of clang tests).
Running opt passes like this makes it harder to make changes such as:
D93817
Disabling PIC globally also disabled PIC for runtimes which was
undesirable, manually override it.
Differential Revision: https://reviews.llvm.org/D103919
This removes the `__sanitizer_*` allocation function definitions from
`hwasan_interceptors.cpp` and moves them into their own file. This way
implementations that do not use interceptors at all can just ignore
(almost) everything in `hwasan_interceptors.cpp`.
Also remove some unused headers in `hwasan_interceptors.cpp` after the move.
Differential Revision: https://reviews.llvm.org/D103564
This patch https://reviews.llvm.org/D102876 caused some lit regressions on z/OS because tmp files were no longer being opened based on binary/text mode. This patch passes OpenFlags when creating tmp files so we can open files in different modes.
Reviewed By: amccarth
Differential Revision: https://reviews.llvm.org/D103806
G_INSERT legalization is incomplete and doesn't work very
well. Instead try to use sequences of G_MERGE_VALUES/G_UNMERGE_VALUES
padding with undef values (although this can get pretty large).
For the case of load/store narrowing, this is still performing the
load/stores in irregularly sized pieces. It might be cleaner to split
this down into equal sized pieces, and rely on load/store merging to
optimize it.
A common mistake for newcomers to MLIR is to try to store extra member
on the Op class. However these are intended to be thing wrapper around
an Operation*, all the storage is meant to be encoded in attribute on
the underlying Operation. This can be confusing to debug, so better
catch it at build time.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D103869
Introduce a new cl::opt to hide "cold" blocks from CFG DOT graphs.
Use BFI to get block relative frequency. Hide the block if the
frequency is below the threshold set by the command line option value.
Reviewed By: davidxl, hoy
Differential Revision: https://reviews.llvm.org/D103640
The comment states the following, for calculating the Line variable:
> Draw a line starting from when we only have 1k left and increasing
> linearly to double the current weight.
However, the value was not calculated as described. Instead, it would
result in a negative value, which resulted in the function always
returning 0 afterwards.
```
// Invariant: CurrentSize <= MaxSize - 200
// Invariant: CurrentWeight >= 0
int Line = (-2 * CurrentWeight) * (MaxSize - CurrentSize + 1000);
// {Line <= 0}
```
This commit fixes the issue and linearly interpolates as described.
Patch by Loris Reiff. Thanks!
Differential Revision: https://reviews.llvm.org/D96207
This implements the 'using enum maybe-qualified-enum-tag ;' part of
1099. It introduces a new 'UsingEnumDecl', subclassed from
'BaseUsingDecl'. Much of the diff is the boilerplate needed to get the
new class set up.
There is one case where we accept ill-formed, but I believe this is
merely an extended case of an existing bug, so consider it
orthogonal. AFAICT in class-scope the c++20 rule is that no 2 using
decls can bring in the same target decl ([namespace.udecl]/8). But we
already accept:
struct A { enum { a }; };
struct B : A { using A::a; };
struct C : B { using A::a;
using B::a; }; // same enumerator
this patch permits mixtures of 'using enum Bob;' and 'using Bob::member;' in the same way.
Differential Revision: https://reviews.llvm.org/D102241
The buffer size (`__nbuf`) in `num_put::do_put` is currently not an
integral/core constant expression. As a result, `__nar` is a Variable Length
Array (VLA). VLAs are a GNU extension and not part of the base C++ standard, so
unless there is good reason to do so they probably shouldn't be used in any of
the standard library headers. The call to `__iob.flags()` is the only thing
keeping `__nbuf` from being a compile time constant, so the solution here is to
simply err on the side of caution and always allocate a buffer large enough to
fit the base prefix.
Note that, while the base prefix for hex (`0x`) is slightly longer than the
base prefix for octal (`0`), this isn't a concern. The difference in the space
needed for the value portion of the string is enough to make up for this.
(Unless we're working with small, oddly sized types such as a hypothetical
`uint9_t`, the space needed for the value portion in octal is at least 1 more
than the space needed for the value portion in hex).
This PR also adds `constexpr` to `__nbuf` to enforce compile time const-ness
going forward.
Reviewed By: Mordante, #libc, Quuxplusone, ldionne
Differential Revision: https://reviews.llvm.org/D103558
When narrowing G_ADD and G_SUB, handle types that aren't a multiple of
the type we're narrowing to. This allows us to handle types like s96
on 64 bit targets.
Note that the test here has a couple of dead instructions because of
the way the setup legalizes. I wasn't able to come up with a way to
write this test that avoids that easily.
Differential Revision: https://reviews.llvm.org/D97811
When narrowing G_INSERT, handle types that aren't a multiple of the
type we're narrowing to. This comes up if we're narrowing something
like an s96 to fit in 64 bit registers and also for non-byte multiple
packed types if they come up.
This implementation handles these cases by extending the extra bits to
the narrow size and truncating the result back to the destination
size.
Differential Revision: https://reviews.llvm.org/D97791
This reverts commit e772216e70
(and fixup 7f6c878a2c).
The build is broken with gcc5 host compiler:
In file included from
from mlir/lib/Dialect/Utils/StructuredOpsUtils.cpp:9:
tools/mlir/include/mlir/IR/BuiltinAttributes.h.inc:424:57: error: type/value mismatch at argument 1 in template parameter list for 'template<class ItTy, class FuncTy, class FuncReturnTy> class llvm::mapped_iterator'
std::function<T(ptrdiff_t)>>;
^
tools/mlir/include/mlir/IR/BuiltinAttributes.h.inc:424:57: note: expected a type, got 'decltype (seq<ptrdiff_t>(0, 0))::const_iterator'
As suggested on rG937c4cffd024, use llvm_unreachable for unhandled integer types (which shouldn't be possible) instead of breaking and dropping down to the existing fatal error handler.
Helps silence static analyzer warnings.
This is both more efficient and more ergonomic than going
through an std::string, e.g. when using llvm::utostr and
in string concat cases.
Unfortunately we can't just overload ::get(). This causes an
ambiguity because both twine and stringref implicitly convert
from std::string.
Differential Revision: https://reviews.llvm.org/D103754
One of the key algorithms used in the "mlir::verify(op)" method is the
dominance checker, which ensures that operand values properly dominate
the operations that use them.
The MLIR dominance implementation has a number of algorithmic problems,
and is not really set up in general to answer dense queries: it's constant
factors are really slow with multiple map lookups and scans, even in the
easy cases. Furthermore, when calling mlir::verify(module) or some other
high level operation, it makes sense to parallelize the dominator
verification of all the functions within the module.
This patch has a few changes to enact this:
1) It splits dominance checking into "IsolatedFromAbove" units. Instead
of building a monolithic DominanceInfo for everything in a module,
for example, it checks dominance for the module to all the functions
within it (noop, since there are no operands at this level) then each
function gets their own DominanceInfo for each of their scope.
2) It adds the ability for mlir::DominanceInfo (and post dom) to be
constrained to an IsolatedFromAbove region. There is no reason to
recurse into IsolatedFromAbove regions since use/def relationships
can't span this region anyway. This is already checked by the time
the verifier gets here.
3) It avoids querying DominanceInfo for trivial checks (e.g. intra Block
references) to eliminate constant factor issues).
4) It switches to lazily constructing DominanceInfo because the trivial
check case handles the vast majority of the cases and avoids
constructing DominanceInfo entirely in some cases (e.g. at the module
level or for many Regions's that contain a single Block).
5) It parallelizes analysis of collections IsolatedFromAbove operations,
e.g. each of the functions within a Module.
All together this is more than a 10% speedup on `firtool` in circt on a
large design when run in -verify-each mode (our default) since the verifier
is invoked after each pass.
Still todo is to parallelize the main verifier pass. I decided to split
this out to its own thing since this patch is already large-ish.
Differential Revision: https://reviews.llvm.org/D103373
The first source has the same EEW as the destination, but we're
using earlyclobber which prevents them from ever being the same
register. This patch attempts to work around this.
-For unmasked .wv, add a special TIED pseudo that pretends like
the first operand and the destination must be the same register. This
disables the earlyclobber for that source. Mark the instruction
as convertible to 3 address form which will switch it to the
original untied pseudo when the TwoAddressInstructionPass decides
that keeping them tied would require an extra copy. This uses
code in RISCVInstrInfo.cpp to do the conversion to the untied
opcode.
The untie test case show that we can generate the untied version.
Not sure it was profitable to do it in this case, but they have
really simple IR.
Reviewed By: arcbbb
Differential Revision: https://reviews.llvm.org/D103552