findIndirectCallFunctionSamples leaves Sum uninitialized if it returns an empty vector. We don't really use Sum in this case (although we do make a copy that isn't used either), so initialize the value to zero to at least silence the static analysis warning.
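A minimal sketch of the pattern, assuming a hypothetical stand-in `findSamples` rather than the real findIndirectCallFunctionSamples signature: the callee only writes its out-parameter when it has something to return, so the caller zero-initializes it.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the real helper: it writes Sum only when it
// has samples to return, mirroring the behaviour described above.
std::vector<uint64_t> findSamples(const std::vector<uint64_t> &Counts,
                                  uint64_t &Sum) {
  std::vector<uint64_t> Result;
  if (Counts.empty())
    return Result; // Early exit: Sum is left untouched.
  Sum = 0;
  for (uint64_t C : Counts) {
    Sum += C;
    Result.push_back(C);
  }
  return Result;
}

// Caller side: initializing Sum means any later copy reads a defined value.
uint64_t totalOrZero(const std::vector<uint64_t> &Counts) {
  uint64_t Sum = 0; // Without "= 0" the return below could read garbage.
  findSamples(Counts, Sum);
  return Sum;
}
```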
These checks are not specific to the instruction based variant of
isPotentiallyReachable(), they are equally valid for the basic
block based variant. Move them there, to make sure that switching
between the instruction and basic block variants cannot introduce
regressions.
Match what's documented in the Intel AOM (and Agner/instlatx64 agree) - vector integer multiplies are pipelined - all Port0, throughput = 2 @ 128bits, 1 @ 64bits.
Noticed while checking reduction costs - now that we can use in-order models in llvm-mca, the atom model is the "worst case scenario" we have in x86.
All the uses that we have for collectBitParts revolve around us matching down to an operation with a single root value - I don't think we're intending to change that (and a lot of collectBitParts assumes it).
The binops cases (OR/FSHL/FSHR) already check if the providers are the same, but that would still mean we waste time collecting through unaryops before getting to them.
Currently we only match bswap intrinsics from or(shl(),lshr()) style patterns when we could often match bitreverse intrinsics almost as cheaply.
Differential Revision: https://reviews.llvm.org/D90170
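For illustration only, here is what the two kinds of source pattern look like when written by hand. The function names are hypothetical and this is not the matcher itself, just the shape of input it would see: both are built from the same or(shl(), lshr()) building blocks, which is why recognising the second is nearly as cheap as the first.

```cpp
#include <cassert>
#include <cstdint>

// Byte swap written as an or(shl(), lshr()) pattern.
uint32_t bswapPattern(uint32_t X) {
  return (X << 24) | ((X << 8) & 0x00FF0000u) |
         ((X >> 8) & 0x0000FF00u) | (X >> 24);
}

// Bit reversal of one byte in the same style: swap nibbles, then bit
// pairs, then adjacent bits.
uint8_t bitreversePattern(uint8_t X) {
  unsigned V = X;
  V = ((V << 4) & 0xF0u) | ((V >> 4) & 0x0Fu);
  V = ((V << 2) & 0xCCu) | ((V >> 2) & 0x33u);
  V = ((V << 1) & 0xAAu) | ((V >> 1) & 0x55u);
  return static_cast<uint8_t>(V);
}

// Reference implementation: move bit I to bit 7 - I.
uint8_t bitreverseReference(uint8_t X) {
  uint8_t R = 0;
  for (int I = 0; I < 8; ++I)
    if (X & (1u << I))
      R |= static_cast<uint8_t>(1u << (7 - I));
  return R;
}
```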
Reapply rG5ed56a821c06 (after it was reverted by rG7aa89c4a22fd) - don't take a reference from a struct that will be erased in X86FrameLowering::eliminateCallFramePseudoInstr
I'm also adding an explicit data layout, so we can
confirm that alignment requirements/prefs are met.
I tried to use complete/scripted CHECK lines here,
but that fails with 1 of the globals, and I'm not sure why.
Use comesBefore() instead of performing an instruction walk. In
line with the previous implementation, instructions are considered
to reach themselves.
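A rough sketch of the same-block case, assuming a toy instruction type whose `Order` field stands in for the cached numbering behind comesBefore(). The type and the helper name are hypothetical. A false result here only means the straight-line path does not apply; the CFG would still have to be consulted.

```cpp
#include <cassert>

// Toy instruction: Order is its position in the parent block.
struct ToyInst {
  int Block;
  unsigned Order;
  bool comesBefore(const ToyInst &Other) const {
    assert(Block == Other.Block && "only defined within one block");
    return Order < Other.Order;
  }
};

// Straight-line reachability inside one block, without walking the
// instruction list.
bool reachesWithoutLeavingBlock(const ToyInst &From, const ToyInst &To) {
  // An instruction reaches itself, so only "To strictly before From" fails.
  return !To.comesBefore(From);
}
```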
The system's network API is in libnetwork.so, so we explicitly need to link to
it on Haiku. This patch is similar to https://reviews.llvm.org/D97633.
Patch by Niels Reedijk. Thanks Niels!
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D98405
We are moving from just dense/compressed to more general dim level
types, so we need more than just an "i1" array for annotations.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D102520
Adds the ability to pass MCRegisterInfo to dump_pretty and to the print functions,
so that if present, target specific enums names are printed instead of enum values.
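The optional-argument shape can be sketched as follows. `ToyRegisterInfo` and `printReg` are hypothetical stand-ins, not the MCRegisterInfo API; the point is only that a null pointer falls back to the raw value.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a target's register-name table.
struct ToyRegisterInfo {
  std::vector<std::string> Names;
};

// Prints the target-specific name when a table is supplied and covers the
// register, otherwise falls back to the raw enum value.
std::string printReg(unsigned Reg, const ToyRegisterInfo *RI = nullptr) {
  if (RI && Reg < RI->Names.size())
    return RI->Names[Reg];
  return std::to_string(Reg);
}
```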
The path to the runtime libraries used by the compiler under test
is normally identical to the path where just built libraries are
created. However, this is not necessarily the case when doing standalone
builds. This is because the external compiler used by tests may choose
to get its runtime libraries from somewhere else.
When doing standalone builds there are two types of testing we could be
doing:
* Test the just built runtime libraries.
* Test the runtime libraries shipped with the compiler under test.
Both types of testing are valid but it confusingly turns out compiler-rt
actually did a mixture of these types of testing.
* The `test/builtins/Unit/` test suite always tested the just built runtime
libraries.
* All other testsuites implicitly use whatever runtime library the
compiler decides to link.
There is no way for us to infer which type of testing the developer
wants so this patch introduces a new
`COMPILER_RT_TEST_STANDALONE_BUILD_LIBS` CMake
option which explicitly declares which runtime libraries should be
tested. If it is `ON` then the just built libraries should be tested,
otherwise the libraries in the external compiler should be tested.
When testing starts the lit test suite queries the compiler used for
testing to see where it will get its runtime libraries from. If these
paths are identical no action is taken (the common case). If the paths
are not identical then we check the value of
`COMPILER_RT_TEST_STANDALONE_BUILD_LIBS` (propagated into the config as
`test_standalone_build_libs`) and check if the test suite supports testing in the
requested configuration.
* If we want to test just built libs and the test suite supports it
(currently only `test/builtins/Unit`) then testing proceeds without any changes.
* If we want to test the just built libs and the test suite doesn't
support it we emit a fatal error to prevent the developer from
testing the wrong runtime libraries.
* If we are testing the compiler's built libs then we adjust
`config.compiler_rt_libdir` to point at the compiler's runtime
directory. This makes the `test/builtins/Unit` tests use the
compiler's builtin library. No other changes are required because
all other testsuites implicitly use the compiler's built libs.
To make the above work the
`test_suite_supports_overriding_runtime_lib_path` test suite config
option has been introduced so we can identify what each test suite
supports.
Note all of these checks **have to be performed** when lit runs.
We cannot run the checks at CMake generation time because
multi-configuration build systems prevent us from knowing what the
paths will be.
We could perhaps support `COMPILER_RT_TEST_STANDALONE_BUILD_LIBS` being
`ON` for most test suites (when the runtime library paths differ) in
the future by specifying a custom compiler resource directory path.
Doing so is out of scope for this patch.
rdar://77182297
Differential Revision: https://reviews.llvm.org/D101681
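The decision procedure described above can be modelled roughly as follows. This is a C++ sketch of the logic only, not the lit configuration code; the struct, its field names, and both functions are hypothetical.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical C++ model of the per-suite lit configuration.
struct SuiteConfig {
  std::string CompilerRtLibDir;          // where the just built libs live
  bool TestStandaloneBuildLibs;          // test_standalone_build_libs
  bool SupportsOverridingRuntimeLibPath; // test_suite_supports_overriding_runtime_lib_path
};

// Returns the runtime library directory the suite should end up using.
// CompilerLibDir is what the compiler under test reported when queried.
std::string resolveRuntimeLibDir(const SuiteConfig &Config,
                                 const std::string &CompilerLibDir) {
  if (CompilerLibDir == Config.CompilerRtLibDir)
    return Config.CompilerRtLibDir; // Common case: nothing to do.
  if (Config.TestStandaloneBuildLibs) {
    if (!Config.SupportsOverridingRuntimeLibPath)
      throw std::runtime_error("suite cannot test the just built libraries");
    return Config.CompilerRtLibDir; // e.g. test/builtins/Unit: no change.
  }
  return CompilerLibDir; // Test the libraries shipped with the compiler.
}

// Convenience wrapper so callers can probe for the fatal-error case.
bool isFatalConfiguration(const SuiteConfig &Config,
                          const std::string &CompilerLibDir) {
  try {
    resolveRuntimeLibDir(Config, CompilerLibDir);
    return false;
  } catch (const std::runtime_error &) {
    return true;
  }
}
```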
GlobalVariables are Constants, yet should not unconditionally be
considered true for __builtin_constant_p.
Via the LangRef
https://llvm.org/docs/LangRef.html#llvm-is-constant-intrinsic:
This intrinsic generates no code. If its argument is known to be a
manifest compile-time constant value, then the intrinsic will be
converted to a constant true value. Otherwise, it will be converted
to a constant false value.
In particular, note that if the argument is a constant expression
which refers to a global (the address of which _is_ a constant, but
not manifest during the compile), then the intrinsic evaluates to
false.
Move isManifestConstant from ConstantFolding to be a method of
Constant so that we can reuse the same logic in
LowerConstantIntrinsics.
pr/41459
Reviewed By: rsmith, george.burgess.iv
Differential Revision: https://reviews.llvm.org/D102367
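A toy model of the rule quoted from the LangRef, assuming a simplified constant hierarchy. `ToyConstant` and its kinds are hypothetical and only approximate the real class structure: plain data is manifest, a global is not, and an aggregate or expression is manifest only if all of its operands are.

```cpp
#include <cassert>
#include <vector>

// Simplified constant node; the real hierarchy is richer than this.
struct ToyConstant {
  enum Kind { Data, Aggregate, Expr, Global } K;
  std::vector<const ToyConstant *> Operands;

  bool isManifestConstant() const {
    if (K == Data)
      return true; // Integers, floats and similar literal data.
    if (K == Aggregate || K == Expr) {
      for (const ToyConstant *Op : Operands)
        if (!Op->isManifestConstant())
          return false; // One non-manifest operand taints the whole value.
      return true;
    }
    return false; // Globals: constant address, but not known at compile time.
  }
};
```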
The FixSGPRCopies pass converts instructions to VALU when
removing illegal VGPR to SGPR copies. Instructions that use SCC
are changed to use VCC instead. When that happens, the pass must
also change instructions that define SCC to define VCC.
The pass was not changing the SCC definition when an ADDC was
converted because one of its inputs is a VGPR to SGPR copy while
the initial ADD instruction, which defines SCC, was not converted.
This causes a compilation failure due to a use of an undefined
physical register.
This patch adds code that inserts the SCC definition in the
MoveToVALU worklist when a SCC use is converted to a VCC use.
Differential Revision: https://reviews.llvm.org/D102111
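A heavily simplified sketch of the worklist idea, assuming a toy instruction record with hypothetical names and a linear instruction list. It does not model the real pass: converting an instruction that reads SCC also queues the nearest earlier instruction that defines SCC, so the definition and the use move to VCC together.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Toy scalar instruction; IsVALU flips to true once it has been converted.
struct ToyScalarInst {
  bool DefinesSCC;
  bool UsesSCC;
  bool IsVALU;
};

// Converts the instruction at Start and everything that conversion forces.
void moveToVALU(std::vector<ToyScalarInst> &Insts, std::size_t Start) {
  std::deque<std::size_t> Worklist{Start};
  while (!Worklist.empty()) {
    std::size_t I = Worklist.front();
    Worklist.pop_front();
    if (Insts[I].IsVALU)
      continue; // Already converted.
    Insts[I].IsVALU = true;
    if (!Insts[I].UsesSCC)
      continue;
    // The use now reads VCC, so its SCC definition must be converted too.
    for (std::size_t J = I; J-- > 0;) {
      if (Insts[J].DefinesSCC) {
        Worklist.push_back(J);
        break;
      }
    }
  }
}
```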
Currently we don't support multiple return types, so as a workaround we use error_code to represent:
1) The dangling probe.
2) A non-probe instruction, whose weight is ignored.
When merging the instructions' weights for the whole BB, the error codes are filtered out. But if all instructions of the BB give error_code, the outside logic marks it as a BB requiring the inference algorithm to infer its weight. This is different from a zero value, which is treated as a cold block.
Fix one place: if we can't find the FunctionSamples in the profile data, which indicates the BB is cold, we now return zero.
Also refine the comments.
Reviewed By: hoy, wenlei
Differential Revision: https://reviews.llvm.org/D102007
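The distinction between "no weight" and "zero weight" can be sketched like this, with std::nullopt standing in for the error_code and assuming the block weight is the maximum of the usable instruction weights. The alias and function name are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// std::nullopt plays the role of the error_code: "no usable weight here".
using Weight = std::optional<uint64_t>;

// Merges instruction weights into a block weight, skipping missing ones.
// nullopt result: every instruction was skipped, so inference is needed.
// 0 result: at least one instruction had a real weight and the block is cold.
Weight blockWeight(const std::vector<Weight> &InstWeights) {
  Weight Max;
  for (const Weight &W : InstWeights)
    if (W)
      Max = Max ? std::max(*Max, *W) : *W;
  return Max;
}
```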
Previously, we already used BatchAA for individual simple pointer
dependency queries. This extends BatchAA usage for the non-local
case, so that only one BatchAA instance is used for all blocks,
instead of one instance per block.
Use of BatchAA is safe as IR cannot be modified during a MemDep
query.
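As a loose illustration of why a single shared instance helps, here is a toy memoizing wrapper. `ToyBatchAA` is a hypothetical model, not the BatchAA interface, and the alias rule is arbitrary. The cached answers stay valid only because nothing changes between queries, which is the same assumption stated above.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Toy batch: remembers alias answers for as long as the instance lives.
class ToyBatchAA {
  std::map<std::pair<int, int>, bool> Cache;
  unsigned Computed = 0;

  // Arbitrary stand-in for an expensive alias computation.
  static bool expensiveAlias(int A, int B) { return A % 4 == B % 4; }

public:
  bool alias(int A, int B) {
    std::pair<int, int> Key(std::min(A, B), std::max(A, B));
    auto It = Cache.find(Key);
    if (It != Cache.end())
      return It->second; // Answer reused from an earlier block.
    ++Computed;
    bool Result = expensiveAlias(A, B);
    Cache.emplace(Key, Result);
    return Result;
  }
  unsigned numComputed() const { return Computed; }
};

// Non-local style query: every block is scanned with the same batch.
unsigned countAliases(ToyBatchAA &BatchAA, int QueryPtr,
                      const std::vector<std::vector<int>> &Blocks) {
  unsigned N = 0;
  for (const auto &Block : Blocks)
    for (int Ptr : Block)
      if (BatchAA.alias(QueryPtr, Ptr))
        ++N;
  return N;
}
```

With one instance per block, the repeated pointers in later blocks would each be recomputed; with a shared instance they are answered from the cache.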
This patch adds the abstract class SystemZCallingConventionRegisters
which is a SystemZ-specific class detailing special registers used
by calling conventions on the target. SystemZELFRegisters and
SystemZXPLINK64Registers implement this class for ELF and XPLINK64
respectively.
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D102370
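The class layout can be pictured roughly as below. The `Toy` class names, the method names, and the plain integer register numbers are assumptions for illustration and are not taken from the patch; the numbers follow the commonly described ABI roles (r14/r15 for ELF, r7/r4 for XPLINK64).

```cpp
#include <cassert>

// Hypothetical simplified interface for calling-convention registers.
class ToyCallingConventionRegisters {
public:
  virtual int getReturnAddressRegister() const = 0;
  virtual int getStackPointerRegister() const = 0;
  virtual ~ToyCallingConventionRegisters() = default;
};

// ELF flavour (register numbers are illustrative).
class ToyELFRegisters : public ToyCallingConventionRegisters {
public:
  int getReturnAddressRegister() const override { return 14; }
  int getStackPointerRegister() const override { return 15; }
};

// XPLINK64 flavour (register numbers are illustrative).
class ToyXPLINK64Registers : public ToyCallingConventionRegisters {
public:
  int getReturnAddressRegister() const override { return 7; }
  int getStackPointerRegister() const override { return 4; }
};

// Client code asks the interface and never hard-codes a register.
int stackPointerFor(const ToyCallingConventionRegisters &Regs) {
  return Regs.getStackPointerRegister();
}
```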
It's easy to hit the 2**16 limit with i686 GNU toolchains these days.
Clang does it automagically, so it's not needed there, and the option
causes warnings about being unused when linking.
Differential Revision: https://reviews.llvm.org/D102419
Support for Darwin's libsystem_m's vector functions has been added to
LLVM in 93a9a8a8d9.
This patch adds support for -fveclib=Darwin_libsystem_m to Clang.
Reviewed By: arphaman
Differential Revision: https://reviews.llvm.org/D102489
This is not expected to have any practical compile-time effect,
as the alias() calls inside callCapturesBefore() are rare. This
should still be supported for API completeness, and might be
useful for reachability caching.