Implements parts of:
- P0898R3 Standard Library Concepts
- P1754 Rename concepts to standard_case for C++20, while we still can
Differential Revision: https://reviews.llvm.org/D96577
Currently TypePrinter lumps anonymous classes and unnamed classes in one group "anonymous" this is not correct and can be confusing in some contexts.
Differential Revision: https://reviews.llvm.org/D96807
tables.
This gives a modest AST file size reduction, while also fixing crashes
in cases where the key or data length doesn't fit into 16 bits.
Unfortunately, such situations tend to require huge test cases (such as
more than 16K modules or an overload set with 16K entries), and I
couldn't get a testcase to finish in a reasonable amount of time, so no
test is included for that bugfix.
No functionality change intended (other than the bugfix).
Found a problem in indirect call promotion in sample loader pass. Currently
if an indirect call is promoted for a target, and if the parent function is
inlined into some other function, the indirect call can be promoted for the
same target again. That is redundent which can harm performance and can cause
excessive compile time in some extreme case.
The patch fixes the issue. If a target is promoted for an indirect call, the
patch will write ICP metadata with the target call count being set to 0.
In the later ICP in sample profile loader, if it sees a target has 0 count
for an indirect call, it knows the target has been promoted and won't do
indirect call promotion for the indirect call.
The fix brings 0.1~0.2% performance on our search benchmark.
Differential Revision: https://reviews.llvm.org/D96806
The FileCollector asserts that paths passed to addDirectory are indeed
directories. For that to work, the file needs to actually exist. In the
downstream Swift fork we have tests that remove files during testing,
which resulted in this assertion getting triggered.
This patch provides two major changes:
1. Add getRelocationInfo to check if a constant will have static, dynamic, or
no relocations. (Also rename the original needsRelocation to needsDynamicRelocation.)
2. Only allow a constant with no relocations (static or dynamic) to be placed
in a mergeable section.
This will allow unused symbols that contain static relocations and happen to
fit in mergeable constant sections (.rodata.cstN) to instead be placed in
unique-named sections if -fdata-sections is used and subsequently garbage collected
by --gc-sections.
See https://lists.llvm.org/pipermail/llvm-dev/2021-February/148281.html.
Differential Revision: https://reviews.llvm.org/D95960
With CSSPGO all indirect call targets are counted torwards the original indirect call site in the profile, including both inlined and non-inlined targets. Therefore no need to look for callee entry counts. This also fixes the issue where callee entry count doesn't match callsite count due to the nature of CS sampling.
I'm also cleaning up the orginal code that called `findIndirectCallFunctionSamples` just to compute the sum, the return value of which was disgarded.
Reviewed By: wmi, wenlei
Differential Revision: https://reviews.llvm.org/D96990
This is a somewhat reduced testcase that regressed, causing the revert
in 477e3fe4f8.
This was producing a bundle that could not be allocated. This is a
tricky one to reduce/reproduce, but I do like having some sanity check
for this.
Extracts the relevant dimensions from the map under test to build up the
maps to test against in a permutation-invariant way.
This also includes a fix to the indexing maps used by
isColumnMajorMatmul. The maps as currently written do not describe a
column-major matmul. The linalg named op column_major_matmul has the
correct maps (and notably fails the current test).
If `C = matmul(A, B)` we want an operation that given A in column major
format and B in column major format produces C in column major format.
Given that for a matrix, faux column major is just transpose.
`column_major_matmul(transpose(A), transpose(B)) = transpose(C)`. If
`A` is `NxK` and `B` is `KxM`, then `C` is `NxM`, so `transpose(A)` is
`KxN`, `transpose(B)` is `MxK` and `transpose(C)` is `MxN`, not `NxM`
as these maps currently have.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D96984
We currently always store absolute filenames in coverage mapping. This
is problematic for several reasons. It poses a problem for distributed
compilation as source location might vary across machines. We are also
duplicating the path prefix potentially wasting space.
This change modifies how we store filenames in coverage mapping. Rather
than absolute paths, it stores the compilation directory and file paths
as given to the compiler, either relative or absolute. Later when
reading the coverage mapping information, we recombine relative paths
with the working directory. This approach is similar to handling
ofDW_AT_comp_dir in DWARF.
Finally, we also provide a new option, -fprofile-compilation-dir akin
to -fdebug-compilation-dir which can be used to manually override the
compilation directory which is useful in distributed compilation cases.
Differential Revision: https://reviews.llvm.org/D95753
AMDGPU currently has a lot of pre-processing code to pre-split
argument types into 32-bit pieces before passing it to the generic
code in handleAssignments. This is a bit sloppy and also requires some
overly fancy iterator work when building the calls. It's better if all
argument marshalling code is handled directly in
handleAssignments. This handles more situations like decomposing large
element vectors into sub-element sized pieces.
This should mostly be NFC, but does change the generated code by
shifting where the initial argument packing instructions are placed. I
think this is nicer looking, since it now emits the packing code
directly after the relevant copies, rather than after the copies for
the remaining arguments.
This doubles down on gfx6/gfx7 using the gfx8+ ABI for 16-bit
types. This is ultimately the better option, but incompatible with the
DAG. Fixing this requires more work, especially for f16.
We can always look through single-argument (LCSSA) phi nodes when
performing alias analysis. getUnderlyingObject() already does this,
but stripPointerCastsAndInvariantGroups() does not. We still look
through these phi nodes with the usual aliasPhi() logic, but
sometimes get sub-optimal results due to the restrictions on value
equivalence when looking through arbitrary phi nodes. I think it's
generally beneficial to keep the underlying object logic and the
pointer cast stripping logic in sync, insofar as it is possible.
With this patch we get marginally better results:
aa.NumMayAlias | 5010069 | 5009861
aa.NumMustAlias | 347518 | 347674
aa.NumNoAlias | 27201336 | 27201528
...
licm.NumPromoted | 1293 | 1296
I've renamed the relevant strip method to stripPointerCastsForAliasAnalysis(),
as we're past the point where we can explicitly spell out everything
that's getting stripped.
Differential Revision: https://reviews.llvm.org/D96668
Static subtensor / subtensor_insert of the same size as the source / destination tensor and root @[0..0] with strides [1..1] are folded away.
Differential revision: https://reviews.llvm.org/D96991
Namely, implementations of fegetexceptfflag, fesetexceptflag,
fegetenv, fesetenv, feholdexcept and feupdateenv have been added.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D96935
If extload is legal, following transform
(zext (select c, load1, load2)) -> (select c, zextload1, zextload2)
can save one ext instruction.
Differential Revision: https://reviews.llvm.org/D95086
Most Fortran compilers accept the following benign extension,
and it appears in some applications:
SUBROUTINE FOO(A,N)
IMPLICIT NONE
REAL A(N) ! N is used before being typed
INTEGER N
END
Allow it in f18 only for default integer scalar dummy arguments.
Differential Revesion: https://reviews.llvm.org/D96982
Differential Revision: https://reviews.llvm.org/D95913
Usage: -bundle_loader <executable>
This option specifies the executable that will load the build output file being linked.
When building a bundle, users can use the --bundle_loader to specify an executable
that contains symbols referenced, but not implemented in the bundle.
It's not necessarily the case on all architectures that all memory is
addressable in addrspace 0, so casting the pointer to addrspace 0 is
liable to cause problems.
Reviewed By: aartbik, ftynse, nicolasvasilache
Differential Revision: https://reviews.llvm.org/D96380
We currently always store absolute filenames in coverage mapping. This
is problematic for several reasons. It poses a problem for distributed
compilation as source location might vary across machines. We are also
duplicating the path prefix potentially wasting space.
This change modifies how we store filenames in coverage mapping. Rather
than absolute paths, it stores the compilation directory and file paths
as given to the compiler, either relative or absolute. Later when
reading the coverage mapping information, we recombine relative paths
with the working directory. This approach is similar to handling
ofDW_AT_comp_dir in DWARF.
Finally, we also provide a new option, -fprofile-compilation-dir akin
to -fdebug-compilation-dir which can be used to manually override the
compilation directory which is useful in distributed compilation cases.
Differential Revision: https://reviews.llvm.org/D95753
This patch adds lowering to Linalg for the following TOSA ops: negate, rsqrt, mul, select, clamp and reluN and includes support for signless integer and floating point types
Reviewed By: rsuderman
Differential Revision: https://reviews.llvm.org/D96924
This patch adds functionality to compare for the equality between `InterfaceFile`s based on attributes specific to linking.
Reviewed By: cishida, steven_wu
Differential Revision: https://reviews.llvm.org/D96629
Fixes assert in: https://bugs.llvm.org/show_bug.cgi?id=49036
getWasmSection creates sections if they don't exist, but doesn't add them to the Symbols table. This may cause problems in subsequent calls to getOrCreateSymbol which checks this table, the calls createSymbol assuming it doesn't exist, which then checks UsedNames and finds out it does exist, causing an assert on trying to rename a non-temp symbol.
I tried also fixing the somewhat unintuitive forced suffixing (adding `0`), but it turns out that WasmObjectWriter currently assumes these section symbols are unique, so that may have to be a separate fix: https://bugs.llvm.org/show_bug.cgi?id=49252
Also worth noting is that getWasmSection calling createSymbol may not be correct to start with, given that createSymbol seems to assume it is creating non-section symbols. But again, for a future fix.
Related: where some of this was introduced: 8d396acac3
Differential Revision: https://reviews.llvm.org/D96473
This avoids tedious repetition and matches what we do for the
ValueTypeByHwMode uses.
Reviewed By: craig.topper, luismarques
Differential Revision: https://reviews.llvm.org/D96649