The RegionPrinter, RegionOnlyPrinter, RegionViewer and RegionOnlyViewer passes have not yet been ported to the new pass manager.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D119897
Aligned new does not require size to be a multiple of alignment, so
memalign is the correct choice instead of aligned_alloc.
Fixes false reports for unaligned sizes.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D119161
his patch adds builtins and intrinsics for the f16 and f16x2 variants of the ex2
instruction.
These two variants were added in PTX7.0, and are supported by sm_75 and above.
Note that this isn't wired with the exp2 llvm intrinsic because the ex2
instruction is only available in its approx variant.
Running ptxas on the assembly generated by the test f16-ex2.ll works as
expected.
Differential Revision: https://reviews.llvm.org/D119157
Currently when we generate OpenMP offloading code we always make
fallback code for the CPU. This is necessary for implementing features
like conditional offloading and ensuring that unhandled pragmas don't
result in missing symbols. However, this is problematic for a few cases.
For offloading tests we can silently fail to the host without realizing
that offloading failed. Additionally, this makes it impossible to
provide interoperabiility to other offloading schemes like HIP or CUDA
because those methods do not provide any such host fallback guaruntee.
this patch adds the `-fopenmp-offload-mandatory` flag to prevent
generating the fallback symbol on the CPU and instead replaces the
function with a dummy global and the failed branch with 'unreachable'.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D120353
The stack trace addresses may be odd (normally addresses should be even), but
seems a good compromise when the instruction length (2,4,6) cannot be detected
easily.
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D120432
Based on the discussion in D115393, I've updated the names to be more
descriptive.
Reviewed By: ellis, MaskRay
Differential Revision: https://reviews.llvm.org/D120092
Internally to DAGCombiner the SDValues were passed by non-const
reference despite not being modified. They were then passed by
const reference to TLI.
This patch passes them by value which is consistent with the vast
majority of code.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D120420
In combineCarryDiamond() use getAsCarry() to find more candidates for being a carry flag.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D118362
Jim noticed that the regex command is unintentionally recursive. Let's
use the following command regex as an example:
(lldb) com regex humm 's/([^ ]+) ([^ ]+)/p %1 %2 %1 %2/'
If we call it with arguments foo bar, thing behave as expected:
(lldb) humm foo bar
(...)
foo bar foo bar
However, if we include %2 in the arguments, things break down:
(lldb) humm fo%2o bar
(...)
fobaro bar fobaro bar
The problem is that the implementation of the substitution is too naive.
It substitutes the %1 token into the target template in place, then does
the %2 substitution starting with the resultant string. So if the
previous substitution introduced a %2 token, it would get processed in
the second sweep, etc.
This patch addresses the issue by walking the command once and
substituting the % variables in place.
(lldb) humm fo%2o bar
(...)
fo%2o bar fo%2o bar
Furthermore, this patch also reports an error if not enough variables
were provided and add support for substituting %0.
rdar://81236994
Differential revision: https://reviews.llvm.org/D120101
This test doesn't work because the CHECK-NOT line is actually checking
something that only exists on stderr and not stdout.
Changed the test so that we now check both stderr and stdout.
Changed the test so that we check pwr9, pwr10, and future. The cpu names of
power9 or power10 are not supported in the llc backend.
Reviewed By: nemanjai, #powerpc
Differential Revision: https://reviews.llvm.org/D120349
By specifying a sectionMemoryMapper, users can control how
memory for JIT code is allocated.
In particular, I need this in order to use a named memory
region so that profilers such as perf(1) can correctly label
execution cycles coming from JIT'ed code.
Reviewed-by: ezhulenev
Differential Revision: https://reviews.llvm.org/D120415
It's customary for these options to have the -fno- form which is sometimes
handy to work around issues. Using the supported driver option is preferred over
the internal cl::opt option `-mllvm -asan-globals-live-support=0`
Reviewed By: kstoimenov, vitalybuka
Differential Revision: https://reviews.llvm.org/D120391
There should be 1-bit unused field between tid field and is_atomic field of Shadow.
Reviewed By: dvyukov, vitalybuka
Differential Revision: https://reviews.llvm.org/D119417
This change uses instruction's comesBefore method to simplify the code significantly. There's little compile time concern here because getSpillCost already calls comesBefore on every basic block which contains a vectorization candidate. The only additional times we'll build basic block ordering is when we can't schedule a vector candidate anywhere in the containing block.
Differential Revision: https://reviews.llvm.org/D120364
The unique_ptr_ret and weak_ptr_ret tests are not expected to pass on
AIX. These tests check that unique_ptr and weak_ptr are returned by
value, but on AIX, all structs are always returned by reference.
```
3.9.6 Function Return Values
...
Note: Structures of any length and character strings longer than four
bytes are returned in a storage buffer allocated by the caller. The
address of this buffer is passed as a hidden first argument in GPR3,
which causes the first explicit argument word to be passed in GPR4. This
hidden argument is treated as a formal argument and corresponds to the
first word of the argument area.
```
Reviewed By: #powerpc, daltenty, #libc, Quuxplusone, philnik
Differential Revision: https://reviews.llvm.org/D119952
Prior to this change, LLVM would attempt to optimize an
aligned_alloc(33, ...) call to the stack. This flunked an assertion when
trying to emit the alloca, which crashed LLVM. Avoid that with extra
checks.
Differential Revision: https://reviews.llvm.org/D119604
There should be 1-bit unused field between tid field and is_atomic field of Shadow.
Reviewed By: dvyukov, vitalybuka
Differential Revision: https://reviews.llvm.org/D119417
Given a cmpf of either uitofp or sitofp and a constant, attempt to canonicalize it to a cmpi.
This PR rewrites equivalent code within LLVM to now apply to MLIR arith.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D117257
This patch introduce basic function/subroutine calls.
Because of the state of lowering only simple scalar arguments
can be used in the calls. This will be enhanced in follow up
patches with arrays, allocatable, pointer ans so on.
```
subroutine sub1()
end
subroutine sub2()
call sub1()
end
```
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: schweitz
Differential Revision: https://reviews.llvm.org/D120419
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: V Donaldson <vdonaldson@nvidia.com>
+ compare block size with the unrollable inner dimension
+ reduce nesting in the code and simplify a bit IR building
Reviewed By: cota
Differential Revision: https://reviews.llvm.org/D120075
In GNU ld, the definition precedence is: regular symbol assignment > relocatable object definition > `PROVIDE` symbol assignment.
GNU ld's internal linker scripts define the non-reserved (by C and C++)
edata/end/etext with `PROVIDE` so the relocatable object definition takes
precedence. This makes sense because `int end;` is valid.
We currently redefine such symbols if they are COMMON, but not if they are
regular definitions, so `int end;` with -fcommon is essentially a UB in ld.lld.
Fix this (also improve consistency and match GNU ld) by using the
`isDefined` code path for `isCommon`. In GNU ld, reserved identifiers like
`__ehdr_start` do not use `PROVIDE`, while we treat them all as `PROVIDE`, this
seems fine.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D120389
Support to load debug info from dwarf split file, like .dwo, .dwp files. Leverage the `getNonSkeletonUnitDIE(false)` API to achieve this.
Add test cause to make sure all the ranges is well retrieved by the loader.
Reviewed By: ayermolo, hoy, wenlei
Differential Revision: https://reviews.llvm.org/D115973