In D125283, we ensured that integer distributions would not compile when
used with arbitrary unsupported types. This effectively enforced what
the Standard mentions here: http://eel.is/c++draft/rand#req.genl-1.5.
However, this also had the effect of breaking some users that were
using integer distributions with unsupported types like int8_t. Since we
already support using __int128_t in those distributions, it is reasonable
to also support smaller types like int8_t and its unsigned variant. This
commit implements that, adds tests and documents the extension. Note that
we voluntarily don't add support for instantiating these distributions
with bool and char, since those are not integer types. However, it is
trivial to replace uses of these random distributions on char using int8_t.
It is also interesting to note that in the process of adding tests
for smaller types, I discovered that our distributions sometimes don't
provide as faithful a distribution when instantiated with smaller types,
so I had to relax a couple of tests. In particular, we do a really bad
job at implementing the negative binomial, geometric and poisson distributions
for small types. I think this all boils down to the algorithm we use in
std::poisson_distribution, however I am running out of time to investigate
that and changing the algorithm would be an ABI break (which might be
reasonable).
As part of this patch, I also added a mitigation for a very likely
integer overflow bug we were hitting in our tests in negative_binomial_distribution.
I also filed http://llvm.org/PR56656 to track fixing the problematic
distributions with int8_t and uint8_t.
Supersedes D125283.
Differential Revision: https://reviews.llvm.org/D126823
This patch adds the use of the `-internalize-public-api-file` option in
the internalization pass to internalize any definition that isn't
explicitly needed for the interface. This will allow us to perform more
optimizations on the file that normally would not have been possible
with functions internal to the library not being internal.
Depends on D130293
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D130298
The internalize pass supports an option to provide a list of symbols
that should not be internalized. THis is useful retaining certain
defintions that should be kept alive. However, this interface is
somewhat difficult to use as it requires knowing every single symbol's
name and specifying it. Many APIs provide common prefixes for the
symbols exported by the library, so it would make sense to be able to
match these using a simple glob pattern. This patch changes the handling
from a simple string comparison to a glob pattern match.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D130319
Currently the bitcode library is build using the clang front-end
manually. This was originally done because we did not support device
only compilation. Now we support device only compilation, at least for a
single offloading toolchain, so we can instead use clang directly rather
than using the front-end. This saves us needing to define things like
`aux_triple`.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D130293
This is useful for building small test cases and will be utilized in a subsequent commit that adds a fusion example.
Differential Revision: https://reviews.llvm.org/D130344
This op fuses a given payload op into a given container op. Inside the container, all uses of the producer are replaced (fused) with the newly inserted op. If the producer is tileable and accessed via a tensor.extract_slice, the new op computes only the requested slice ("tile and fuse"). Otherwise, the entire tensor value is computed inside the container ("clone and fuse").
Differential Revision: https://reviews.llvm.org/D130244
Convert arith.cmpi to the canonical form with constants on the right side
to simplify further optimizations and open more opportunities for CSE.
Differential Revision: https://reviews.llvm.org/D129929
Previously you got:
AssertionError: False is not True : Emulation test succeeded.
Which is a bit of a head scratcher. The message is used when
the test fails, not when it succeeds.
DW_OP_skip/DW_OP_bra can move offset to the end of the data, which means that this was the last instruction to execute and the interpreter should terminate.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D130285
Copying the folder keeps the original permissions by default. This
creates problems when the source folder is read-only, e.g. in a
packaging environment.
Then, the copied folder in the build directory is read-only as well.
Later on, with configure_file, ClangConfig.cmake is copied into that
directory (in the build tree), failing when the directory is read-only.
Fix that problem by copying the folder without keeping the original
permissions.
Differential Revision: https://reviews.llvm.org/D130254
The re-land fixes module map module dependencies seen on Greendragon, but
not in the clang test suite.
---
Currently we only implement this for the Itanium ABI since the correct
mangling for the initializers in other ABIs is not yet known.
Intended result:
For a module interface [which includes partition interface and implementation
units] (instead of the generic CXX initializer) we emit a module init that:
- wraps the contained initializations in a control variable to ensure that
the inits only happen once, even if a module is imported many times by
imports of the main unit.
- calls module initializers for imported modules first. Note that the
order of module import is not significant, and therefore neither is the
order of imported module initializers.
- We then call initializers for the Global Module Fragment (if present)
- We then call initializers for the current module.
- We then call initializers for the Private Module Fragment (if present)
For a module implementation unit, or a non-module TU that imports at least one
module we emit a regular CXX init that:
- Calls the initializers for any imported modules first.
- Then proceeds as normal with remaining inits.
For all module unit kinds we include a global constructor entry, this allows
for the (in most cases unusual) possibility that a module object could be
included in a final binary without a specific call to its initializer.
Implementation:
- We provide the module pointer in the AST Context so that CodeGen can act
on it and its sub-modules.
- We need to account for module build lines like this:
` clang -cc1 -std=c++20 Foo.pcm -emit-obj -o Foo.o` or
` clang -cc1 -std=c++20 -xc++-module Foo.cpp -emit-obj -o Foo.o`
- in order to do this, we add to ParseAST to set the module pointer in
the ASTContext, once we establish that this is a module build and we
know the module pointer. To be able to do this, we make the query for
current module public in Sema.
- In CodeGen, we determine if the current build requires a CXX20-style module
init and, if so, we defer any module initializers during the "Eagerly
Emitted" phase.
- We then walk the module initializers at the end of the TU but before
emitting deferred inits (which adds any hidden and static ones, fixing
https://github.com/llvm/llvm-project/issues/51873 ).
- We then proceed to emit the deferred inits and continue to emit the CXX
init function.
Differential Revision: https://reviews.llvm.org/D126189
- the grammar ambiguity is eliminated by a guard;
- modify the guard function signatures, now all parameters are folded in
to a single object, avoid a long parameter list (as we will add more
parameters in the near future);
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D130160
This patch adds support for evaluating expressions which reference
a captured `this` from within the context of a C++ lambda expression.
Currently LLDB doesn't provide Clang with enough information to
determine that we're inside a lambda expression and are allowed to
access variables on a captured `this`; instead Clang simply fails
to parse the expression.
There are two problems to solve here:
1. Make sure `clang::Sema` doesn't reject the expression due to an
illegal member access.
2. Materialize all the captured variables/member variables required
to evaluate the expression.
To address (1), we currently import the outer structure's AST context
onto `$__lldb_class`, making the `contextClass` and the `NamingClass`
match, a requirement by `clang::Sema::BuildPossibleImplicitMemberExpr`.
To address (2), we inject all captured variables as locals into the
expression source code.
**Testing**
* Added API test
This is required in preparation for the follow-up patch which adds
support for evaluating expressions from within C++ lambdas. In such
cases we need to materialize variables which are not part of the
current frame but instead are ivars on a 'this' pointer of the current
frame.
Lower F08 shift (shiftl, shiftr, shifta) and combined shift (dshiftl, dshiftr)
intrinsics. The combined shift intrinsics are implemented using the
definitions of shiftl and shiftr as described by the standard.
For non-conformant arguments to the shift intrinsics, the implementation tries
to replicate the behavior of other compilers if most of the other behave
consistently.
Differential Revision: https://reviews.llvm.org/D129316
Lower F08 bit population count intrinsics popcnt, poppar, leadz and trailz. popcnt, leadz and trailz are implemented using the corresponding MLIR math intrinsics. poppar is implemented in terms of popcnt.
Differential Revision: https://reviews.llvm.org/D129584
If a function is non-recursive we only performed intra-procedural
reasoning for reachability (via AA::isPotentiallyReachable). However,
if it is re-entrant that doesn't mean we can't reach. Instead of this
problematic logic in the reachability reasoning we utilize logic in
AAPointerInfo. If a location is for sure written by a function it can
be re-entrant or recursive we know only intra-procedural reasoning is
sufficient.
The existing code doesn't expect dummy values (undef, poison, null-derived
constants etc) as arguments of these intrinsics. However, they can be there
in unreached code. Currently we fail trying to find base for them.
Handle these cases separately. Return null as base for them to be consistent
with the handling in the main algorithm in findBaseDefiningValue.
Differential Revision: https://reviews.llvm.org/D129561
Reviewed By: apilipenko
Add affine.if canonicalization to compose affine.apply ops into its set
and operands. This eliminates affine.apply ops feeding into affine.if
ops.
Differential Revision: https://reviews.llvm.org/D130242
If we have a dominating must-write access we do not need to know the
initial value of some object to perform reasoning about the potential
values. The dominating must-write has overwritten the initial value.