Currently we create offloading entries to register device variables with
the host. When we register a variable we will look up the symbol in the
device image and map the device address to the host address. This is a
problem when the symbol is declared with hidden visibility or internal
linkage. This means the symbol is not accessible externally and we
cannot get its address. We should still allow static variables to be
declared on the device, but ew should not create an offloading entry for
them so they exist independently on the host and device.
Fixes#54309
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D122352
This provides a way to create an operation without manipulating
OperationState directly. This is useful for creating unregistered ops.
Reviewed By: rriddle, mehdi_amini
Differential Revision: https://reviews.llvm.org/D120787
This patch adds lowering tests for Fortran
interfaces.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D122326
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Add more use cases of lowering of derived types.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D122329
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Peter Steinfeld <psteinfeld@nvidia.com>
Co-authored-by: V Donaldson <vdonaldson@nvidia.com>
Use `ProfileSummaryBuilder::DefaultCutoffs` for llvm-profdata detailed summary printing for Instr profile.
Differential Revision: https://reviews.llvm.org/D122210
Only use this on generic parser for now by not registering any dialect. For flushing out some parser bugs. The textual format is not meant to be load bearing in production runs, but still useful to remove edge cases/failures.
Differential Revision: https://reviews.llvm.org/D122267
Complete pseudo probes decoding can result in large memory usage. In practice only a small porting of the decoded probes are used in profile generation. I'm changing the full decoding mode to be decoding for profiled functions only, though we still do a full scan of the .pseudoprobe section due to a missing table-of-content but we don't have to build the in-memory data structure for functions not sampled.
To build the in-memory data structure for profiled functions only, I'm rewriting the previous non-recursive probe decoding logic to be recursive. This is easy to read and maintain.
I also have to change the previous representation of unsymbolized context from probe-based stack to address-based stack since the profiled functions are unknown yet by the time of virtual unwinding. The address-based stack will be converted to probe-based stack after virtual unwinding and on-demand probe decoding.
I'm seeing 20GB memory is saved for one of our internal large service.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D121643
This improves the getHostCPUName check for Apple M1 CPUs, which
previously would always be considered cyclone instead. This also enables
`-march=native` support when building on M1 CPUs which would previously
fail. This isn't as sophisticated as the X86 CPU feature checking which
consults the CPU via getHostCPUFeatures, but this is still better than
before. This CPU selection could also be invalid if this was run on an
iOS device instead, ideally we can improve those cases as they come up.
Differential Revision: https://reviews.llvm.org/D119788
Clang fails to diagnose:
```
void test() {
int j = 0;
for (int i = 0; i < 1000; i++)
j++;
return;
}
```
Reason: Missing support for UnaryOperator.
We should not warn with volatile variables... so add check for it.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D122271
All existing loader tests are switched to an integration test added with
the new rule. Also, the getenv test is now enabled as an integration test.
All loader tests have been moved to test/integration. Also, the simple
checker library for the previous loader tests has been moved to a
separate directory of its own.
A follow up change will perform more cleanup of the loader CMake rules
to eliminate now redundent options.
Reviewed By: lntue, michaelrj
Differential Revision: https://reviews.llvm.org/D122266
In some cases padding bubbles between sequential MFMA instructions may
lead to increased inter-wave performance. Add option to request to pad
some portion of these stall cycles with s_nops.
Fixes: SWDEV-326925
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D121437
There is potential for endless recursion if we try to determine the
underlying objects of a load, just to end up with the load as underlying
object. A proper solution will require us to pass a visited set around.
This will happen as we cleanup genericValueTraversal soon.
macOS 12.3 no longer ships non-3 python.
Almost all of these scripts were launched by ninja, and the GN files
already told it to run them under python3, so this is a fairly small
change. The main effect is that if you run them manually, you now
get the same behavior.
(A small set of scripts, gn.py, gen.py, sync_source_lists_from_cmake.py,
are for manual running. For these, it is an actual change.)
Differential Revision: https://reviews.llvm.org/D122345
Emitting at error at EOF will emit the diagnostic past the end of the file. When emitting an error during parsing at EOF, emit it at the previous character.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D122295
It behaves (mostly) like the LLVM_INSTALL_CCTOOLS_SYMLINKS option
in cmake.
The minor difference is that the llvm-objcopy symlinks bitcode_strip
and install_name_tool symlink to llvm-objcopy directly in the GN build,
while it's a bitcode_strip -> llvm-bitcode-strip -> objcopy chain
in the CMake build (and analogous for install_name_tool).
The implementation is very similar to the implementation of the
existing llvm_install_binutils_symlinks arg.
Differential Revision: https://reviews.llvm.org/D122312
When --disable-sample-loader-inlining is true, skip inline transformation, but merge profiles of inlined instances to outlining versions.
Differential Revision: https://reviews.llvm.org/D121862
This patch adds a lightweight assertion handler mechanism that can be
overriden at link-time in a fashion similar to `operator new`.
This is a third take on https://llvm.org/D121123 (which allowed customizing
the assertion handler at compile-time), and https://llvm.org/D119969
(which allowed customizing the assertion handler at runtime only).
This approach is, I think, the best of all three explored approaches.
Indeed, replacing the assertion handler in user code is ergonomic,
yet we retain the ability to provide a custom assertion handler when
deploying to older platforms that don't have a default handler in
the dylib.
As-is, this patch provides a pretty good amount of backwards compatibility
with the previous debug mode:
- Code that used to set _LIBCPP_DEBUG=0 in order to get basic assertions
in their code will still get basic assertions out of the box, but
those assertions will be using the new assertion handler support.
- Code that was previously compiled with references to __libcpp_debug_function
and friends will work out-of-the-box, no changes required. This is
because we provide the same symbols in the dylib as we used to.
- Code that used to set a custom __libcpp_debug_function will stop
compiling, because we don't provide that declaration anymore. Users
will have to migrate to the new way of setting a custom assertion
handler, which is extremely easy. I suspect that pool of users is
very limited, so breaking them at compile-time is probably acceptable.
The main downside of this approach is that code being compiled with
assertions enabled but deploying to an older platform where the assertion
handler didn't exist yet will fail to compile. However users can easily
fix the problem by providing a custom assertion handler and defining
the _LIBCPP_AVAILABILITY_CUSTOM_ASSERTION_HANDLER_PROVIDED macro to
let the library know about the custom handler. In a way, this is
actually a feature because it avoids a load-time error that one would
otherwise get when trying to run the code on the older target.
Differential Revision: https://reviews.llvm.org/D121478
These are based on tests that were included in the abandoned D122299.
Comments indicate what should or should not happen if we change
behavior in Negator.
This patch introduces 2 new lldb utility functions:
- lldbutil.start_listening_from: This can be called in the test setup to
create a listener and set it up for a specific event mask and add it
to the user-provided broadcaster's list.
- lldbutil.fetch_next_event: This will use fetch a single event from the
provided istener and return it if it matches the provided broadcaster.
The motivation behind this is to easily test new kinds of events
(i.e. Swift type-system progress events). However, this patch also
updates `TestProgressReporting.py` and `TestDiagnosticReporting.py`
to make use of these new helper functions.
Differential Revision: https://reviews.llvm.org/D122193
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
Don't call EltVT.getSizeInBits() or SrcEltVT.getSizeInBits() a second
time. They are already in EltSize or SrcEltSize variables.
Refactor some comparisons to use multiply instead of division.
Before actually executing the ExtractAPIAction, clear the
CompilationInstance's input list and replace it with a single
synthesized file that just includes (or imports in ObjC) all the inputs.
Depends on D122141
Differential Revision: https://reviews.llvm.org/D122175
CoroSplit lowers various coroutine intrinsics. It's a CGSCC pass and
CGSCC passes don't run on unreachable functions. Normally GlobalDCE will
come along and delete unreachable functions, but we don't run GlobalDCE
under -O0, so an unreachable function with coroutine intrinsics may
never have CoroSplit run on it.
This patch adds GlobalDCE when coroutines intrinsics are present. It
also now runs all coroutine passes conditional when coroutine intrinsics
are present. This should also solve the -O0 regression reported in
D105877 due to LazyCallGraph construction.
Fixes https://github.com/llvm/llvm-project/issues/54117
Reviewed By: ChuanqiXu
Differential Revision: https://reviews.llvm.org/D122275
`config->priorities` has been used to hold the intermediate state during the construction of the order in which sections should be laid out. This is not a good place to hold this state since the intermediate state is not a "configuration" for LLD. It should be encapsulated in a class for building a mapping from section to priority (which I created in this diff as the `PriorityBuilder` class).
The same thing is being done for `config->callGraphProfile`.
Reviewed By: #lld-macho, int3
Differential Revision: https://reviews.llvm.org/D122156
Currently, Clang handles some qualifiers correctly for __auto_type, but
it does not handle the restrict or _Atomic qualifiers in the same way
that GCC does. This patch handles those qualifiers so that they attach
to the deduced type the same as const and volatile already do.
This fixes https://github.com/llvm/llvm-project/issues/53652