ld64 doesn't warn on builds using `-install_name` if it's a bundle. But, the
current warning is nice to have because `install_name` only works with dylib.
To prevent an overflow of warnings in build logs and have parity with ld64,
create a `--warn-dylib-install-name` and `--warn-no-dylib-install-name` flag
that enables this LLD specific warning.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D113534
When using clangd, it's possible to trigger assertions in
NumericLiteralParser and CharLiteralParser when switching git branches.
This commit removes the initial asserts on invalid input and replaces
those asserts with the error handling mechanism from those respective
classes instead. This allows clangd to gracefully recover without
crashing.
See https://github.com/clangd/clangd/issues/888 for more information
on the clangd crashes.
Don't expand CTTZ if CTPOP or CTLZ is supported on the promoted type.
We have special handling for CTTZ expansion to use those ops with a
small conversion. The setup for that doesn't generate extra code or
large constants so we don't gain anything from expanding early and we
make CTTZ_ZERO_UNDEF codegen worse.
Follow up from post commit feedback on D112268. We don't seem to have
any in tree tests that care about this.
Symbol table parsing has evolved over the years and many plug-ins contained duplicate code in the ObjectFile::GetSymtab() that used to be pure virtual. With this change, the "Symbtab *ObjectFile::GetSymtab()" is no longer virtual and will end up calling a new "void ObjectFile::ParseSymtab(Symtab &symtab)" pure virtual function to actually do the parsing. This helps centralize the code for parsing the symbol table and allows the ObjectFile base class to do all of the common work, like taking the necessary locks and creating the symbol table object itself. Plug-ins now just need to parse when they are asked to parse as the ParseSymtab function will only get called once.
Differential Revision: https://reviews.llvm.org/D113965
Transpose conv2d shape inference was incorrect, tests did not properly validate
that the shape inference was executing. Corrected shape inference, and extended
tests to actually execute.
Reviewed By: NatashaKnk
Differential Revision: https://reviews.llvm.org/D114026
Teach the HWLOC topology method how to detect Atom and Core
types so hybrid CPUs are properly detected and represented when using
the HWLOC topology method.
Differential Revision: https://reviews.llvm.org/D112270
The current implementation is quite clunky; OperationName stores either an Identifier
or an AbstractOperation that corresponds to an operation. This has several problems:
* OperationNames created before and after an operation are registered are different
* Accessing the identifier name/dialect/etc. from an OperationName are overly branchy
- they need to dyn_cast a PointerUnion to check the state
This commit refactors this such that we create a single information struct for every
operation name, even operations that aren't registered yet. When an OperationName is
created for an unregistered operation, we only populate the name field. When the
operation is registered, we populate the remaining fields. With this we now have two
new classes: OperationName and RegisteredOperationName. These both point to the
same underlying operation information struct, but only RegisteredOperationName can
assume that the operation is actually registered. This leads to a much cleaner API, and
we can also move some AbstractOperation functionality directly to OperationName.
Differential Revision: https://reviews.llvm.org/D114049
The current implementation of Windows Processor Groups has
a separate topology method to handle them. This patch deprecates
that specific method and uses the regular CPUID topology
method by default and inserts the Windows Processor Group objects
in the topology manually.
Notes:
* The preference for processor groups is lowered to a value less than
socket so that the user will see sockets in the KMP_AFFINITY=verbose
output instead of processor groups when sockets=processor groups.
* The topology's capacity is modified to handle additional topology layers
without the need for reallocation.
* If a user asks for a granularity setting that is "above" the processor
group layer, then the granularity is adjusted "down" to the processor
group since this is the coarsest layer available for threads.
Differential Revision: https://reviews.llvm.org/D112273
If some CPUs are offline, then make sure they are not included in the
fullMask even if norespect is given to KMP_AFFINITY.
Differential Revision: https://reviews.llvm.org/D112274
In order to keep signal:noise high for the `__eh_frame` diff, I have teased-out the NFC changes and put them here.
Differential Revision: https://reviews.llvm.org/D114017
The name seems to have been left over from a renaming effort on an unexercised
codepaths that are difficult to catch in Python. Fix it and add a test that
exercises the codepath.
Reviewed By: gysit
Differential Revision: https://reviews.llvm.org/D114004
We've stopped doing it in libc++ for a while now because these names
would end up rotting as we move things around and copy/paste stuff.
This cleans up all the existing files so as to stop the spreading
as people copy-paste headers around.
Remove restriction forcing users to specify the KMP_HW_SUBSET value in
topology order. This patch sorts the user KMP_HW_SUBSET value before
trying to apply it. For example: 1s,4c,2t is equivalent to 2t,1s,4c
Differential Revision: https://reviews.llvm.org/D112027
distutils is deprecated and will be removed, so we shouldn't be
using it.
We were using it to compute LLDB_PYTHON_RELATIVE_PATH.
Discussing a similar issue
[at python.org](https://bugs.python.org/issue41282), Filipe Laíns said:
If you are relying on the value of distutils.sysconfig.get_python_lib()
as you shown in your system, you probably don't want to. That
directory (dist-packages) should be for Debian provided packages
only, so moving to sysconfig.get_path() would be a good thing,
as it has the correct value for user installed packages on your
system.
So I propose using a relative path from `sys.prefix` to
`sysconfig.get_path("platlib")` instead.
On Mac and windows, this results in the same paths as we had before,
which are `lib/python3.9/site-packages` and `Lib\site-packages`,
respectively.
On ubuntu however, this will change the path from
`lib/python3/dist-packages` to `lib/python3.9/site-packages`.
This change seems to be correct, as Filipe said above, `dist-packages`
belongs to the distribution, not us.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D114106
This solves the same crash as in D104503, but with a different approach.
The test case test_non_dom demonstrates a case where scev-aa crashes today. (If exercised either by -eval-aa or -licm.) The basic problem is that SCEV-AA expects to be able to compute a pointer difference between two SCEVs for any two pair of pointers we do an alias query on. For (valid, but out of scope) reasons, we can end up asking whether expressions in different sub-loops can alias each other. This results in a subtraction expression being formed where neither operand dominates the other.
The approach this patch takes is to leverage the "defining scope" notion we introduced for flag semantics to detect and disallow the formation of the problematic SCEV. This ends up being relatively straight forward on that new infrastructure. This change does hint that we should probably be verifying a similar property for all SCEVs somewhere, but I'll leave that to a follow on change.
Differential Revision: D114112
The __flags variable needs to be of type 'long' in order to get sign extended
properly.
internal_clone() uses an svc (Supervisor Call) directly (as opposed to
internal_syscall), and therefore needs to take care to set errno and return
-1 as needed.
Review: Ulrich Weigand
Fix the fact that previously strtof/d/ld would only accept a NaN as
having parentheses if the thing in the parentheses was a valid number,
now it will accept any combination of letters and numbers, but will only
put valid numbers in the mantissa.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D113790
If we're rotating vXi8 by a splatted amount, then unpack to vXi16, perform a SHL by the (extended) scalar, and then pack the results.
This is a vector equivalent to the "rotl(x,y) -> (((aext(x) << bw) | zext(x)) << (y & (bw-1))) >> bw" style expansion we do for scalars in LowerFunnelShift.
I think we can usefully use this for other vector types and vector funnel-shifts in the future, depending how we expand beyond D113192 for matching rotations/funnel-shifts for more type/ops.
With this,
void f() { __asm__("mov eax, ebx"); }
now compiles with clang with -masm=intel.
This matches gcc.
The flag is not accepted in clang-cl mode. It has no effect on
MSVC-style `__asm {}` blocks, which are unconditionally in intel
mode both before and after this change.
One difference to gcc is that in clang, inline asm strings are
"local" while they're "global" in gcc. Building the following with
-masm=intel works with clang, but not with gcc where the ".att_syntax"
from the 2nd __asm__() is in effect until file end (or until a
".intel_syntax" somewhere later in the file):
__asm__("mov eax, ebx");
__asm__(".att_syntax\nmovl %ebx, %eax");
__asm__("mov eax, ebx");
This also updates clang's intrinsic headers to work both in
-masm=att (the default) and -masm=intel modes.
The official solution for this according to "Multiple assembler dialects in asm
templates" in gcc docs->Extensions->Inline Assembly->Extended Asm
is to write every inline asm snippet twice:
bt{l %[Offset],%[Base] | %[Base],%[Offset]}
This works in LLVM after D113932 and D113894, so use that.
(Just putting `.att_syntax` at the start of the snippet works in some but not
all cases: When LLVM interpolates in parameters like `%0`, it uses at&t or
intel syntax according to the inline asm snippet's flavor, so the `.att_syntax`
within the snippet happens to late: The interpolated-in parameter is already
in intel style, and then won't parse in the switched `.att_syntax`.)
It might be nice to invent a `#pragma clang asm_dialect push "att"` /
`#pragma clang asm_dialect pop` to be able to force asm style per snippet,
so that the inline asm string doesn't contain the same code in two variants,
but let's leave that for a follow-up.
Fixes PR21401 and PR20241.
Differential Revision: https://reviews.llvm.org/D113707
- Replace irrelevant synopsis by a comment
- Use a .verify.cpp test instead of .compile.fail.cpp
- Remove unnecessary includes in one of the tests (was a copy-paste error)
Differential Revision: https://reviews.llvm.org/D114094
This is preparation for D113707, where I want to make `-masm=intel`
emit `asm inteldialect` instructions.
`{movq %rbx, %rax|mov rax, rbx}` is supposed to evaluate to the bit
between { and | for att and to the bit between | and } for intel.
Since intel will become `asm inteldialect`, which alls EmitMSInlineAsmStr(),
EmitMSInlineAsmStr() has to support variants as well.
(clang translates `{...|...}` to `$(...$|...$)`. I'm not sure why
it doesn't just send along only the first `...` or the second `...`
to LLVM, but given the notes in PR23933 let's not do a big
reorganization in this codepath.)
Differential Revision: https://reviews.llvm.org/D113932
If we have a large enough floating point type that can exactly
represent the integer value, we can convert the value to FP and
use the exponent to calculate the leading/trailing zeros.
The exponent will contain log2 of the value plus the exponent bias.
We can then remove the bias and convert from log2 to leading/trailing
zeros.
This doesn't work for zero since the exponent of zero is zero so we
can only do this for CTLZ_ZERO_UNDEF/CTTZ_ZERO_UNDEF. If we need
a value for zero we can use a vmseq and a vmerge to handle it.
We need to be careful to make sure the floating point type is legal.
If it isn't we'll continue using the integer expansion. We could split the vector
and concatenate the results but that needs some additional work and evaluation.
Differential Revision: https://reviews.llvm.org/D111904
`asm` always has AT&T-style input (`asm inteldialect` has Intel-style asm
input), so EmitGCCInlineAsmStr() always has to pick the same variant since it
cares about the input asm string, not the output asm string.
For PowerPC, that default variant is 1. For other targets, it's 0.
Without this, the included test case errors out with
error: unknown use of instruction mnemonic without a size suffix
mov rax, rbx
since it picks the intel branch and then tries to interpret it as AT&T
when selecting intel-style output with `-x86-asm-syntax=intel`.
Differential Revision: https://reviews.llvm.org/D113894
This will drop file version information from span names, reducing
overall cardinality and also effect logging when skipping actions in scheduler.
Differential Revision: https://reviews.llvm.org/D113390
Fortran defines LEN(X) = 0 after CHARACTER(LEN=-1)::X so
apply MAX(0, ...) to character length expressions.
Differential Revision: https://reviews.llvm.org/D114030