LLVMGetInitializer returns nullptr in case there is no initializer.
There is not much that can be done with nullptr in OCaml, not even
test if it is null. Also, there does not seem to be a C or OCaml API
to test if there is an initializer. So this diff changes
Llvm.global_initializer to return an option.
Reviewed By: whitequark
Differential Revision: https://reviews.llvm.org/D65195
This caused non-deterministic compiler output; see comment on the
code review.
> This patch updates the various IR passes to correctly handle dbg.values with a
> DIArgList location. This patch does not actually allow DIArgLists to be produced
> by salvageDebugInfo, and it does not affect any pass after codegen-prepare.
> Other than that, it should cover every IR pass.
>
> Most of the changes simply extend code that operated on a single debug value to
> operate on the list of debug values in the style of any_of, all_of, for_each,
> etc. Instances of setOperand(0, ...) have been replaced with with
> replaceVariableLocationOp, which takes the value that is being replaced as an
> additional argument. In places where this value isn't readily available, we have
> to track the old value through to the point where it gets replaced.
>
> Differential Revision: https://reviews.llvm.org/D88232
This reverts commit df69c69427.
SYCL compilations initiated by the driver will spawn off one or more
frontend compilation jobs (one for device and one for host). This patch
reworks the driver options to make upstreaming this from the downstream
SYCL fork easier.
This patch introduces a language option to identify host executions
(SYCLIsHost) and a -cc1 frontend option to enable this mode. -fsycl and
-fno-sycl become driver-only options that are rejected when passed to
-cc1. This is because the frontend and beyond should be looking at
whether the user is doing a device or host compilation specifically.
Because the frontend should only ever be in one mode or the other,
-fsycl-is-device and -fsycl-is-host are mutually exclusive options.
Documentation of verify_function is incorrect and that of
const_of_int64 is incomplete.
Reviewed By: whitequark
Differential Revision: https://reviews.llvm.org/D77884
'ForOpIterArgsFolder' can now remove iterator arguments (and corresponding
results) with no use.
Example:
```
%cst = constant 32 : i32
%0:2 = scf.for %arg1 = %lb to %ub step %step iter_args(%arg2 = %arg0, %arg3 = %cst)
-> (i32, i32) {
%1 = addu %arg2, %cst : i32
scf.yield %1, %1 : i32, i32
}
use(%0#0)
```
%arg3 is not used in the block, and its corresponding result `%0#1` has no use,
thus remove the iter argument.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D98711
Returning structs directly in LLVM does not necessarily align with the C ABI of
the platform. This might happen to work on Linux but for small structs this
breaks on Windows. With this change, the wrappers work platform independently.
Differential Revision: https://reviews.llvm.org/D98725
As a follow-up to D95691, add new diagnostic groups named
pre-c++N-compat to replace the old diagnostic groups with the standards
listed out explicitly. The old group names are retained for backwards
compatibility.
Added additional information about the SSA like properties
that has to be fulfilled in the bufferization steps.
Differential Revision: https://reviews.llvm.org/D95522
The "path" recorded for timing purposes is only used as a key into a dictionary. It is never used as an actual path to a filesystem API, therefore we should use '/' as the canonical separator so that Unix and Windows machines can share timing data. This also ensures that the lit testing works across platforms.
Reviewed By: jhenderson, jmorse
Differential Revision: https://reviews.llvm.org/D98767
Previously NEON used a target specific intrinsic for frintn, given that
the FROUNDEVEN ISD node now exists, move over to that instead and add
codegen support for that node for both NEON and fixed length SVE.
Differential Revision: https://reviews.llvm.org/D98487
This commit fixes the lowering of `Affine.IfOp` to `SCF.IfOp` in the
presence of yield values. These changes have been made as a part of
`-lower-affine` pass.
Differential Revision: https://reviews.llvm.org/D98760
This adds the cost of an i1 extract and a branch to the cost in
getMemInstScalarizationCost when the instruction is predicated. These
predicated loads/store would generate blocks of something like:
%c1 = extractelement <4 x i1> %C, i32 1
br i1 %c1, label %if, label %else
if:
%sa = extractelement <4 x i32> %a, i32 1
%sb = getelementptr inbounds float, float* %pg, i32 %sa
%sv = extractelement <4 x float> %x, i32 1
store float %sa, float* %sb, align 4
else:
So this increases the cost by the extract and branch. This is probably
still too low in many cases due to the cost of all that branching, but
there is already an existing hack increasing the cost using
useEmulatedMaskMemRefHack. It will increase the cost of a memop if it is
a load or there are more than one store. This patch improves the cost
for when there is only a single store, and hopefully at some point in
the future the hack can be removed.
Differential Revision: https://reviews.llvm.org/D98243
Current SLP pass has this piece of code that inserts a trunc instruction
after the vectorized instruction. In the case that the vectorized instruction
is a phi node and not the last phi node in the BB, the trunc instruction
will be inserted between two phi nodes, which will trigger verify problem
in debug version or unpredictable error in another pass.
This patch changes the algorithm to 'if the last vectorized instruction
is a phi, insert it after the last phi node in current BB' to fix this problem.
This patch adds an optimization path for BUILD_VECTOR nodes where the
majority of the elements are identical. These can be splatted, with the
remaining elements patched up with INSERT_VECTOR_ELTs. The threshold can
be tweaked as required - it is currently conservative. Undef elements
are disregarded when judging the dominance of a particular element. This
allows them to be covered by the splat value.
In addition, vectors of 2 elements are always optimized to a splat (for
the upper element) and an insert at element zero.
This optimization is disabled when optimizing for size.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D98700
Split out some of the instructions predicated on the dot2-insts target
feature into a new dot7-insts, in preparation for subtargets that have
some but not all of these instructions. NFCI.
Differential Revision: https://reviews.llvm.org/D98717
This patch reduces the time taken for clang to compile the generated
disassembler for an out-of-tree target with InsnType bigger than 64 bits
from 4m30s to 48s.
D67686 did a similar thing for CodeEmitterGen.
The idea is to tweak the API of the APInt-like InsnType class so that
we don't need so many temporary InsnTypes. This takes advantage of the
rule stated in D52100 that currently "no string of bits extracted
from the encoding may exceeed 64-bits", so we can use uint64_t for some
temporaries.
D52100 goes on to say that "fields are still permitted to exceed 64-bits
so long as they aren't one contiguous string of bits". This patch breaks
that by always using a "uint64_t tmp" in the generated decodeToMCInst,
but it should be easy to fix in FilterChooser::emitBinaryParser by
choosing to use a different type of tmp based on the known total field
width.
Differential Revision: https://reviews.llvm.org/D98046
The idiom:
```
DeclContext::lookup_result R = DeclContext::lookup(Name);
for (auto *D : R) {...}
```
is not safe when in the loop body we trigger deserialization from an AST file.
The deserialization can insert new declarations in the StoredDeclsList whose
underlying type is a vector. When the vector decides to reallocate its storage
the pointer we hold becomes invalid.
This patch replaces a SmallVector with an singly-linked list. The current
approach stores a SmallVector<NamedDecl*, 4> which is around 8 pointers.
The linked list is 3, 5, or 7. We do better in terms of memory usage for small
cases (and worse in terms of locality -- the linked list entries won't be near
each other, but will be near their corresponding declarations, and we were going
to fetch those memory pages anyway). For larger cases: the vector uses a
doubling strategy for reallocation, so will generally be between half-full and
full. Let's say it's 75% full on average, so there's N * 4/3 + 4 pointers' worth
of space allocated currently and will be 2N pointers with the linked list. So we
break even when there are N=6 entries and slightly lose in terms of memory usage
after that. We suspect that's still a win on average.
Thanks to @rsmith!
Differential revision: https://reviews.llvm.org/D91524
As reported in D96348 <https://reviews.llvm.org/D96348>, the
`Posix/regex_startend.cpp` test `FAIL`s on Solaris because
`REG_STARTEND` isn't defined. It's a BSD extension not present everywhere.
E.g. AIX doesn't have it, too.
Fixed by wrapping the test in `#ifdef REG_STARTEND`.
Tested on `amd64-pc-solaris2.11`, `sparcv9-sun-solaris2.11`, and
`x86_64-pc-linux-gnu`.
Differential Revision: https://reviews.llvm.org/D98425
This commit makes escapes symmetrical, meaning that having escape
before and after the branching, where parameter is not called on
one of the paths, will have the same effect.
Differential Revision: https://reviews.llvm.org/D98622
Also fix a comment typo, and remove a superfluous "std::" qualififcation
in __libcpp_semaphore_wait_timed for consistency.
This mirrors what was suggested in review of
1773eec692.
Differential Revision: https://reviews.llvm.org/D98015
This patch is to replace the fixed value with expression.
Keep .file section as fixed values as it might be changed. The
remaining sections will hardly be modified. So the Index values
are sequential. By using expression, we can avoid the fixed value
changes in coming patches.
This is a follow-up of patch D97117.
Reviewed By: hubert.reinterpretcast, shchenz
Differential Revision: https://reviews.llvm.org/D98620
1. Generate the mapping for clauses between the parser class and the
corresponding clause kind for OpenMP and OpenACC using tablegen.
2. Add a common function to get the OmpObjectList from the OpenMP
clauses to avoid repetition of code.
Reviewed by: Kiranchandramohan @kiranchandramohan , Valentin Clement @clementval
Differential Revision: https://reviews.llvm.org/D98603
The commit 506df1bbfd introduced
a call to `caml_alloc_initialized_string` which seems to be
unavailable on older OCaml versions. So I'm now switching to
using `caml_alloc_string` and using a `memcpy` after that, as
is done in the rest of the file.
Buildbot failure:
https://lab.llvm.org/buildbot/#/builders/16/builds/7919
Many (but not all) DebugInfo functions are now added to the
OCaml bindings, and rest can be safely added incrementally.
Differential Revision: https://reviews.llvm.org/D90831
BasicAA stores a reference to LoopInfo inside. This imposes an implicit
requirement of keeping it up to date whenever we modify the IR (in particular,
whenever we modify terminators of blocks that belong to loops). Failing
to do so leads to incorrect state of the LoopInfo.
Because general AA does not require loop info updates and provides to API to
update it properly, the users of AA reasonably assume that there is no need to
update the loop info. It may be a reason of bugs, as example in PR43276 shows.
This patch drops dependence of BasicAA on LoopInfo to avoid this problem.
This may potentially pessimize the result of queries to BasicAA.
Differential Revision: https://reviews.llvm.org/D98627
Reviewed By: nikic
This fixes defects in D98223 [lld-macho] implement options -(un)exported_symbol(s_list):
* disallow export of hidden symbols
* verify that whitelisted literal names are defined in the symbol table
* reflect export-status overrides in `nlist` attribute of `N_EXT` or `N_PEXT`
Thanks to @thakis for raising these issues
Differential Revision: https://reviews.llvm.org/D98381
This change doesn't handle TIMEZONE, tm_isdst and leap seconds.
Moved shared code between mktime and gmtime into time_utils.cpp.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D98467
Some parameters to attributes and types rely on special comparison routines other than operator== to ensure equality. This revision adds support for those parameters by allowing them to specify a `comparator` code block that determines if `$_lhs` and `$_rhs` are equal. An example of one of these paramters is APFloat, which requires `bitwiseIsEqual` for bitwise comparison (which we want for attribute equality).
Differential Revision: https://reviews.llvm.org/D98473