This patch implements initial backend support for a -mtune CPU controlled by a "tune-cpu" function attribute. If the attribute is not present X86 will use the resolved CPU from target-cpu attribute or command line.
This patch adds MC layer support a tune CPU. Each CPU now has two sets of features stored in their GenSubtargetInfo.inc tables . These features lists are passed separately to the Processor and ProcessorModel classes in tablegen. The tune list defaults to an empty list to avoid changes to non-X86. This annoyingly increases the size of static tables on all target as we now store 24 more bytes per CPU. I haven't quantified the overall impact, but I can if we're concerned.
One new test is added to X86 to show a few tuning features with mismatched tune-cpu and target-cpu/target-feature attributes to demonstrate independent control. Another new test is added to demonstrate that the scheduler model follows the tune CPU.
I have not added a -mtune to llc/opt or MC layer command line yet. With no attributes we'll just use the -mcpu for both. MC layer tools will always follow the normal CPU for tuning.
Differential Revision: https://reviews.llvm.org/D85165
This allows targets to know exactly which operands are contributing to
the dependency, which is required for targets with per-operand
scheduling models.
Differential Revision: https://reviews.llvm.org/D77135
This commit removes the artificial types <512 x i1> and <1024 x i1>
from HVX intrinsics, and makes v512i1 and v1024i1 no longer legal on
Hexagon.
It may cause existing bitcode files to become invalid.
* Converting between vector predicates and vector registers must be
done explicitly via vandvrt/vandqrt instructions (their intrinsics),
i.e. (for 64-byte mode):
%Q = call <64 x i1> @llvm.hexagon.V6.vandvrt(<16 x i32> %V, i32 -1)
%V = call <16 x i32> @llvm.hexagon.V6.vandqrt(<64 x i1> %Q, i32 -1)
The conversion intrinsics are:
declare <64 x i1> @llvm.hexagon.V6.vandvrt(<16 x i32>, i32)
declare <128 x i1> @llvm.hexagon.V6.vandvrt.128B(<32 x i32>, i32)
declare <16 x i32> @llvm.hexagon.V6.vandqrt(<64 x i1>, i32)
declare <32 x i32> @llvm.hexagon.V6.vandqrt.128B(<128 x i1>, i32)
They are all pure.
* Vector predicate values cannot be loaded/stored directly. This directly
reflects the architecture restriction. Loading and storing or vector
predicates must be done indirectly via vector registers and explicit
conversions via vandvrt/vandqrt instructions.
Differential revision: https://reviews.llvm.org/D72701
The patch adds a new option ABI for Hexagon. It primary deals with
the way variable arguments are passed and is use in the Hexagon Linux Musl
environment.
If a callee function has a variable argument list, it must perform the
following operations to set up its function prologue:
1. Determine the number of registers which could have been used for passing
unnamed arguments. This can be calculated by counting the number of
registers used for passing named arguments. For example, if the callee
function is as follows:
int foo(int a, ...){ ... }
... then register R0 is used to access the argument ' a '. The registers
available for passing unnamed arguments are R1, R2, R3, R4, and R5.
2. Determine the number and size of the named arguments on the stack.
3. If the callee has named arguments on the stack, it should copy all of these
arguments to a location below the current position on the stack, and the
difference should be the size of the register-saved area plus padding
(if any is necessary).
The register-saved area constitutes all the registers that could have
been used to pass unnamed arguments. If the number of registers forming
the register-saved area is odd, it requires 4 bytes of padding; if the
number is even, no padding is required. This is done to ensure an 8-byte
alignment on the stack. For example, if the callee is as follows:
int foo(int a, ...){ ... }
... then the named arguments should be copied to the following location:
current_position - 5 (for R1-R5) * 4 (bytes) - 4 (bytes of padding)
If the callee is as follows:
int foo(int a, int b, ...){ ... }
... then the named arguments should be copied to the following location:
current_position - 4 (for R2-R5) * 4 (bytes) - 0 (bytes of padding)
4. After any named arguments have been copied, copy all the registers that
could have been used to pass unnamed arguments on the stack. If the number
of registers is odd, leave 4 bytes of padding and then start copying them
on the stack; if the number is even, no padding is required. This
constitutes the register-saved area. If padding is required, ensure
that the start location of padding is 8-byte aligned. If no padding is
required, ensure that the start location of the on-stack copy of the
first register which might have a variable argument is 8-byte aligned.
5. Decrement the stack pointer by the size of register saved area plus the
padding. For example, if the callee is as follows:
int foo(int a, ...){ ... } ;
... then the decrement value should be the following:
5 (for R1-R5) * 4 (bytes) + 4 (bytes of padding) = 24 bytes
The decrement should be performed before the allocframe instruction.
Increment the stack-pointer back by the same amount before returning
from the function.
There was a change to trap1 instruction between v62 and v65. This
feature will allow the assembler/disassembler to handle different
variants depending on the CPU version.
* Reordered MVT simple types to group scalable vector types
together.
* New range functions in MachineValueType.h to only iterate over
the fixed-length int/fp vector types.
* Stopped backends which don't support scalable vector types from
iterating over scalable types.
Reviewers: sdesmalen, greened
Reviewed By: greened
Differential Revision: https://reviews.llvm.org/D66339
llvm-svn: 372099
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
Eliminate the stack frame in functions with the noreturn nounwind
attributes, and when the noreturn-stack-elim target feature is
enabled. This reduces the code and stack space needed for noreturn
functions.
Differential Revision: https://reviews.llvm.org/D54210
llvm-svn: 346532
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.
Patch produced by
for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done
Differential Revision: https://reviews.llvm.org/D46290
llvm-svn: 331272
This adds two features: "packets", and "nvj".
Enabling "packets" allows the compiler to generate instruction packets,
while disabling it will prevent it and disable all optimizations that
generate them. This feature is enabled by default on all subtargets.
The feature "nvj" allows the compiler to generate new-value jumps and it
implies "packets". It is enabled on all subtargets.
The exception is made for packets with endloop instructions, since they
require a certain minimum number of instructions in the packets to which
they apply. Disabling "packets" will not prevent hardware loops from
being generated.
llvm-svn: 327302
This is a follow-up to r325169, this time for all types, not just HVX
vector types.
Disable this by default, since it's not always safe.
llvm-svn: 326915
All these headers already depend on CodeGen headers so moving them into
CodeGen fixes the layering (since CodeGen depends on Target, not the
other way around).
llvm-svn: 318490
This patch lets the llvm tools handle the new HVX target features that
are added by frontend (clang). The target-features are of the form
"hvx-length64b" for 64 Byte HVX mode, "hvx-length128b" for 128 Byte mode HVX.
"hvx-double" is an alias to "hvx-length128b" and is soon will be deprecated.
The hvx version target feature is upgated form "+hvx" to "+hvxv{version_number}.
Eg: "+hvxv62"
For the correct HVX code generation, the user must use the following
target features.
For 64B mode: "+hvxv62" "+hvx-length64b"
For 128B mode: "+hvxv62" "+hvx-length128b"
Clang picks a default length if none is specified. If for some reason,
no hvx-length is specified to llvm, the compilation will bail out.
There is a corresponding clang patch.
Differential Revision: https://reviews.llvm.org/D38851
llvm-svn: 316101
This removes the duplicate HVX instruction set for the 128-byte mode.
Single instruction set now works for both modes (64- and 128-byte).
llvm-svn: 313362
An instruction may have multiple predecessors that are candidates
for using .cur. However, only one of them can use .cur in the
packet. When this case occurs, we need to make sure that only
one of the dependences gets a 0 latency value.
Patch by Brendon Cahoon.
llvm-svn: 275790