This patch combines the `emitParallel` logic prototyped in D61953 with
the OpenMPIRBuilder (D69785) and introduces `CreateParallel`.
Reviewed By: fghanim
Differential Revision: https://reviews.llvm.org/D70109
Summary: The specific number of records loaded depends on the number of kinds, but the difference between the lazy and not lazy cases does not.
Reviewers: modocache
Subscribers: mehdi_amini, hiraditya, steven_wu, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71730
As a permanent and generic solution to the problem of variable
finalization (destructors, lastprivate, ...), this patch introduces the
finalization stack. The objects on the stack describe (1) the
(structured) regions the OpenMP-IR-Builder is currently constructing,
(2) if these are cancellable, and (3) the callback that will perform the
finalization (=cleanup) when necessary.
As the finalization can be necessary multiple times, at different source
locations, the callback takes the position at which code is currently
generated. This position will also encode the destination of the "region
exit" block *iff* the finalization call was issues for a region
generated by the OpenMPIRBuilder. For regions generated through the old
Clang OpenMP code geneneration, the "region exit" is determined by Clang
inside the finalization call instead (see getOMPCancelDestination).
As a first user, the parallel + cancel barrier interaction is changed.
In contrast to the temporary solution before, the barrier generation in
Clang does not need to be aware of the "CancelDestination" block.
Instead, the finalization callback is and, as described above, later
even that one does not need to be.
D70109 will be updated to use this scheme.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D70258
In the worst case, this requires a 128-bit move instruction to
implicitly zero the upper bits. In the common case, we should
recognize the producing instruction already zeroed the upper bits.
Summary:
We're already scanning forward through the basic block. Might as
well just remember eflags defs instead of doing a bounded search
backwards later.
Based on a comment in D71841.
Reviewers: RKSimon, spatel, uweigand
Reviewed By: uweigand
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71865
This cleans up and merges `gnu-symbols.test` to `symbols.test`.
Initially `gnu-symbols.test` tested the following things:
1) How symbols are printed in GNU style.
It does not make sense to have a separate file for such tests.
2) It tried to test proc-specific symbol indexes. The test was incomplete and
also we already have `symbol-shndx.test` for that, so this part was removed.
3) It tested `--dyn-symbols` and `--symbols` correlation. All following
cases were moved to `symbols.test`:
a) That `--dyn-symbols` does not trigger showing regular symbols..
b) That `--symbols` triggers `--dyn-symbols` implicitly.
c) That `--dyn-symbols` and `--symbols` works fine together.
Differential revision: https://reviews.llvm.org/D71697
Summary:
As discussed in D71799, we have found that it is more useful to reach an optimistic fixpoint in AAValueSimpify when the value is constant or undef.
Reviewers: jdoerfert, sstefan1
Reviewed By: jdoerfert
Subscribers: baziotis, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71852
Summary:
Follow-up on: https://reviews.llvm.org/D71435
We basically use `checkForAllInstructions` to loop through all the instructions in a function that access memory through a pointer: load, store, atomicrmw, atomiccmpxchg
Note that we can now use the `getPointerOperand()` that gets us the pointer operand for an instruction that belongs to the aforementioned set.
Question: This function returns `nullptr` if the instruction is `volatile`. Why?
Guess: Because if it is volatile, we don't want to do any transformation to it.
Another subtle point is that I had to add AtomicRMW, AtomicCmpXchg to `initializeInformationCache()`. Following `checkAllInstructions()` path, that
seemed the most reasonable place to add it and correct the fact that these instructions were ignored (they were not in `OpcodeInstMap` etc.). Is that ok?
Reviewers: jdoerfert, sstefan1
Reviewed By: jdoerfert, sstefan1
Subscribers: hiraditya, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71787
_Eventually_, this attribute will be assigned to a function if it
contains undefined behavior. As a first small step, I tried to make it
loop through the load instructions in a function (eventually, the plan
is to check if a load instructions causes undefined behavior, because
e.g. dereferences a null pointer - Also eventually, this won't happen in
initialize() but in updateImpl()).
Patch By: Stefanos Baziotis (@baziotis)
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D71435
All of
"no-frame-pointer-elim-non-leaf"
"no-frame-pointer-elim-non-leaf"="true"
"no-frame-pointer-elim-non-leaf"="false"
mean "frame-pointer"="non-leaf", which is quite counter-intuitive.
llvmorg-10-init-16046-ga36ddf0aa9d accidentally broke it.
This fixes the -DLLVM_ENABLE_EXPENSIVE_CHECKS=On test:
```
*** Bad machine code: Non-flag-setting Thumb1 mov is v6-only ***
- function: pass_C
- basic block: %bb.0 entry (0x1fc9bf0)
- instruction: $r0 = tMOVr killed $r6, 14, $noreg
```
Summary:
Fix the behavior of StringRef::count(StringRef) to not count overlapping occurrences, as is stated in the documentation.
Fixes bug https://bugs.llvm.org/show_bug.cgi?id=44072
I added Krzysztof Parzyszek to review this change because a use of this function in HexagonInstrInfo::getInlineAsmLength might depend on the overlapping-behavior. I don't have enough domain knowledge to tell if this change could break anything there.
All other uses of this method in LLVM (besides the unit tests) only use single-character search strings. In those cases, search occurrences can not overlap anyway.
Patch by Benno (@Bensge)
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D70585
On 32-bit targets we can't use the scalar instruction so we
insert the scalar into a vector and use packed conversions.
Previously we used either v4f32->v4i64 or v4f64->v4i64 to avoid
some complexity creating target specific ISD opcodes for
v4f32->v2i64. But this causes extra vzeroupper instructions and
possibly frequency throttling on Intel CPUs.
This patch changes this to create a 128-bit vector and uses a
target specific ISD opcode if needed.
m_scratch_ast_source_up is only used by ClangASTContextForExpressions so it
should also be declared only in that class. Also make all other members of
ClangASTContext private and move the initialization code for ClangASTContextForExpressions
into the constructor.
Replace tidy::utils::lexer::getConstQualifyingToken with a corrected and also
generalized to other qualifiers variant - getQualifyingToken.
Fixes PR44326