// RUN: %clang_cc1 -no-opaque-pointers -triple x86_64-linux-gnu -emit-llvm -o - %s | FileCheck %s --check-prefixes=CHECK,LINUX
// RUN: %clang_cc1 -no-opaque-pointers -triple x86_64-windows-pc -fms-compatibility -emit-llvm -o - %s | FileCheck %s --check-prefixes=CHECK,WINDOWS

#ifdef _WIN64
#define ATTR(X) __declspec(X)
#else
#define ATTR(X) __attribute__((X))
#endif // _WIN64

// Each version should have an IFunc and an alias.
// LINUX: @SingleVersion = weak_odr alias void (), void ()* @SingleVersion.ifunc
// LINUX: @TwoVersions = weak_odr alias void (), void ()* @TwoVersions.ifunc
// LINUX: @OrderDispatchUsageSpecific = weak_odr alias void (), void ()* @OrderDispatchUsageSpecific.ifunc
// LINUX: @TwoVersionsSameAttr = weak_odr alias void (), void ()* @TwoVersionsSameAttr.ifunc
// LINUX: @ThreeVersionsSameAttr = weak_odr alias void (), void ()* @ThreeVersionsSameAttr.ifunc
// LINUX: @OrderSpecificUsageDispatch = weak_odr alias void (), void ()* @OrderSpecificUsageDispatch.ifunc
// LINUX: @NoSpecifics = weak_odr alias void (), void ()* @NoSpecifics.ifunc
// LINUX: @HasGeneric = weak_odr alias void (), void ()* @HasGeneric.ifunc
// LINUX: @HasParams = weak_odr alias void (i32, double), void (i32, double)* @HasParams.ifunc
// LINUX: @HasParamsAndReturn = weak_odr alias i32 (i32, double), i32 (i32, double)* @HasParamsAndReturn.ifunc
// LINUX: @GenericAndPentium = weak_odr alias i32 (i32, double), i32 (i32, double)* @GenericAndPentium.ifunc
// LINUX: @DispatchFirst = weak_odr alias i32 (), i32 ()* @DispatchFirst.ifunc

// LINUX: @SingleVersion.ifunc = weak_odr ifunc void (), void ()* ()* @SingleVersion.resolver
// LINUX: @TwoVersions.ifunc = weak_odr ifunc void (), void ()* ()* @TwoVersions.resolver
// LINUX: @OrderDispatchUsageSpecific.ifunc = weak_odr ifunc void (), void ()* ()* @OrderDispatchUsageSpecific.resolver
// LINUX: @TwoVersionsSameAttr.ifunc = weak_odr ifunc void (), void ()* ()* @TwoVersionsSameAttr.resolver
// LINUX: @ThreeVersionsSameAttr.ifunc = weak_odr ifunc void (), void ()* ()* @ThreeVersionsSameAttr.resolver
// LINUX: @OrderSpecificUsageDispatch.ifunc = weak_odr ifunc void (), void ()* ()* @OrderSpecificUsageDispatch.resolver
// LINUX: @NoSpecifics.ifunc = weak_odr ifunc void (), void ()* ()* @NoSpecifics.resolver
// LINUX: @HasGeneric.ifunc = weak_odr ifunc void (), void ()* ()* @HasGeneric.resolver
// LINUX: @HasParams.ifunc = weak_odr ifunc void (i32, double), void (i32, double)* ()* @HasParams.resolver
// LINUX: @HasParamsAndReturn.ifunc = weak_odr ifunc i32 (i32, double), i32 (i32, double)* ()* @HasParamsAndReturn.resolver
// LINUX: @GenericAndPentium.ifunc = weak_odr ifunc i32 (i32, double), i32 (i32, double)* ()* @GenericAndPentium.resolver
// LINUX: @DispatchFirst.ifunc = weak_odr ifunc i32 (), i32 ()* ()* @DispatchFirst.resolver

ATTR(cpu_specific(ivybridge))
void SingleVersion(void){}
// LINUX: define{{.*}} void @SingleVersion.S() #[[S:[0-9]+]]
// WINDOWS: define dso_local void @SingleVersion.S() #[[S:[0-9]+]]

ATTR(cpu_dispatch(ivybridge))
void SingleVersion(void);
// LINUX: define weak_odr void ()* @SingleVersion.resolver()
// LINUX: call void @__cpu_indicator_init
// LINUX: ret void ()* @SingleVersion.S
// LINUX: call void @llvm.trap
// LINUX: unreachable

// WINDOWS: define weak_odr dso_local void @SingleVersion() comdat
// WINDOWS: call void @__cpu_indicator_init()
// WINDOWS: call void @SingleVersion.S()
// WINDOWS-NEXT: ret void
// WINDOWS: call void @llvm.trap
// WINDOWS: unreachable

ATTR(cpu_specific(ivybridge))
void NotCalled(void){}
// LINUX: define{{.*}} void @NotCalled.S() #[[S]]
// WINDOWS: define dso_local void @NotCalled.S() #[[S:[0-9]+]]

// Done before any of the implementations. Also has an undecorated forward
// declaration.
void TwoVersions(void);

ATTR(cpu_dispatch(ivybridge, knl))
void TwoVersions(void);
// LINUX: define weak_odr void ()* @TwoVersions.resolver()
// LINUX: call void @__cpu_indicator_init
// LINUX: ret void ()* @TwoVersions.Z
// LINUX: ret void ()* @TwoVersions.S
// LINUX: call void @llvm.trap
// LINUX: unreachable

// WINDOWS: define weak_odr dso_local void @TwoVersions() comdat
// WINDOWS: call void @__cpu_indicator_init()
// WINDOWS: call void @TwoVersions.Z()
// WINDOWS-NEXT: ret void
// WINDOWS: call void @TwoVersions.S()
// WINDOWS-NEXT: ret void
// WINDOWS: call void @llvm.trap
// WINDOWS: unreachable

ATTR(cpu_specific(ivybridge))
void TwoVersions(void){}
// CHECK: define {{.*}}void @TwoVersions.S() #[[S]]

ATTR(cpu_specific(knl))
void TwoVersions(void){}
// CHECK: define {{.*}}void @TwoVersions.Z() #[[K:[0-9]+]]

ATTR(cpu_specific(ivybridge, knl))
void TwoVersionsSameAttr(void){}
// CHECK: define {{.*}}void @TwoVersionsSameAttr.S() #[[S]]
// CHECK: define {{.*}}void @TwoVersionsSameAttr.Z() #[[K]]

ATTR(cpu_specific(atom, ivybridge, knl))
void ThreeVersionsSameAttr(void){}
// CHECK: define {{.*}}void @ThreeVersionsSameAttr.O() #[[O:[0-9]+]]
// CHECK: define {{.*}}void @ThreeVersionsSameAttr.S() #[[S]]
// CHECK: define {{.*}}void @ThreeVersionsSameAttr.Z() #[[K]]

ATTR(cpu_specific(knl))
void CpuSpecificNoDispatch(void) {}
// CHECK: define {{.*}}void @CpuSpecificNoDispatch.Z() #[[K:[0-9]+]]

ATTR(cpu_dispatch(knl))
void OrderDispatchUsageSpecific(void);
// LINUX: define weak_odr void ()* @OrderDispatchUsageSpecific.resolver()
// LINUX: call void @__cpu_indicator_init
// LINUX: ret void ()* @OrderDispatchUsageSpecific.Z
// LINUX: call void @llvm.trap
// LINUX: unreachable

// WINDOWS: define weak_odr dso_local void @OrderDispatchUsageSpecific() comdat
// WINDOWS: call void @__cpu_indicator_init()
// WINDOWS: call void @OrderDispatchUsageSpecific.Z()
// WINDOWS-NEXT: ret void
// WINDOWS: call void @llvm.trap
// WINDOWS: unreachable

// CHECK: define {{.*}}void @OrderDispatchUsageSpecific.Z()

ATTR(cpu_specific(knl))
void OrderSpecificUsageDispatch(void) {}
// CHECK: define {{.*}}void @OrderSpecificUsageDispatch.Z() #[[K:[0-9]+]]

void usages(void) {
  SingleVersion();
  // LINUX: @SingleVersion.ifunc()
  // WINDOWS: @SingleVersion()
  TwoVersions();
  // LINUX: @TwoVersions.ifunc()
  // WINDOWS: @TwoVersions()
  TwoVersionsSameAttr();
  // LINUX: @TwoVersionsSameAttr.ifunc()
  // WINDOWS: @TwoVersionsSameAttr()
  ThreeVersionsSameAttr();
  // LINUX: @ThreeVersionsSameAttr.ifunc()
  // WINDOWS: @ThreeVersionsSameAttr()
  CpuSpecificNoDispatch();
  // LINUX: @CpuSpecificNoDispatch.ifunc()
  // WINDOWS: @CpuSpecificNoDispatch()
  OrderDispatchUsageSpecific();
  // LINUX: @OrderDispatchUsageSpecific.ifunc()
  // WINDOWS: @OrderDispatchUsageSpecific()
  OrderSpecificUsageDispatch();
  // LINUX: @OrderSpecificUsageDispatch.ifunc()
  // WINDOWS: @OrderSpecificUsageDispatch()
}

// LINUX: declare void @CpuSpecificNoDispatch.ifunc()

// has an extra config to emit!
ATTR(cpu_dispatch(ivybridge, knl, atom))
void TwoVersionsSameAttr(void);
// LINUX: define weak_odr void ()* @TwoVersionsSameAttr.resolver()
// LINUX: ret void ()* @TwoVersionsSameAttr.Z
// LINUX: ret void ()* @TwoVersionsSameAttr.S
// LINUX: ret void ()* @TwoVersionsSameAttr.O
// LINUX: call void @llvm.trap
// LINUX: unreachable

// WINDOWS: define weak_odr dso_local void @TwoVersionsSameAttr() comdat
// WINDOWS: call void @TwoVersionsSameAttr.Z
// WINDOWS-NEXT: ret void
// WINDOWS: call void @TwoVersionsSameAttr.S
// WINDOWS-NEXT: ret void
// WINDOWS: call void @TwoVersionsSameAttr.O
// WINDOWS-NEXT: ret void
// WINDOWS: call void @llvm.trap
// WINDOWS: unreachable

ATTR(cpu_dispatch(atom, ivybridge, knl))
void ThreeVersionsSameAttr(void){}
|
2019-09-11 09:54:48 +08:00
|
|
|
// LINUX: define weak_odr void ()* @ThreeVersionsSameAttr.resolver()
|
2018-10-26 02:57:19 +08:00
|
|
|
// LINUX: call void @__cpu_indicator_init
|
|
|
|
// LINUX: ret void ()* @ThreeVersionsSameAttr.Z
|
|
|
|
// LINUX: ret void ()* @ThreeVersionsSameAttr.S
|
|
|
|
// LINUX: ret void ()* @ThreeVersionsSameAttr.O
|
|
|
|
// LINUX: call void @llvm.trap
|
|
|
|
// LINUX: unreachable
|
|
|
|
|
2019-09-11 09:54:48 +08:00
|
|
|
// WINDOWS: define weak_odr dso_local void @ThreeVersionsSameAttr() comdat
|
2018-10-26 02:57:19 +08:00
|
|
|
// WINDOWS: call void @__cpu_indicator_init
|
|
|
|
// WINDOWS: call void @ThreeVersionsSameAttr.Z
|
|
|
|
// WINDOWS-NEXT: ret void
// WINDOWS: call void @ThreeVersionsSameAttr.S
// WINDOWS-NEXT: ret void
// WINDOWS: call void @ThreeVersionsSameAttr.O
// WINDOWS-NEXT: ret void
// WINDOWS: call void @llvm.trap
// WINDOWS: unreachable

ATTR(cpu_dispatch(knl))
void OrderSpecificUsageDispatch(void);
// LINUX: define weak_odr void ()* @OrderSpecificUsageDispatch.resolver()
// LINUX: ret void ()* @OrderSpecificUsageDispatch.Z

// WINDOWS: define weak_odr dso_local void @OrderSpecificUsageDispatch() comdat
// WINDOWS: call void @__cpu_indicator_init
// WINDOWS: call void @OrderSpecificUsageDispatch.Z
// WINDOWS-NEXT: ret void

// No Cpu Specific options.
ATTR(cpu_dispatch(atom, ivybridge, knl))
void NoSpecifics(void);
// LINUX: define weak_odr void ()* @NoSpecifics.resolver()
// LINUX: call void @__cpu_indicator_init
// LINUX: ret void ()* @NoSpecifics.Z
// LINUX: ret void ()* @NoSpecifics.S
// LINUX: ret void ()* @NoSpecifics.O
// LINUX: call void @llvm.trap
// LINUX: unreachable

// WINDOWS: define weak_odr dso_local void @NoSpecifics() comdat
// WINDOWS: call void @__cpu_indicator_init
// WINDOWS: call void @NoSpecifics.Z
// WINDOWS-NEXT: ret void
// WINDOWS: call void @NoSpecifics.S
// WINDOWS-NEXT: ret void
// WINDOWS: call void @NoSpecifics.O
// WINDOWS-NEXT: ret void
// WINDOWS: call void @llvm.trap
// WINDOWS: unreachable

ATTR(cpu_dispatch(atom, generic, ivybridge, knl))
void HasGeneric(void);
// LINUX: define weak_odr void ()* @HasGeneric.resolver()
// LINUX: call void @__cpu_indicator_init
// LINUX: ret void ()* @HasGeneric.Z
// LINUX: ret void ()* @HasGeneric.S
// LINUX: ret void ()* @HasGeneric.O
// LINUX: ret void ()* @HasGeneric.A
// LINUX-NOT: call void @llvm.trap

// WINDOWS: define weak_odr dso_local void @HasGeneric() comdat
// WINDOWS: call void @__cpu_indicator_init
// WINDOWS: call void @HasGeneric.Z
// WINDOWS-NEXT: ret void
// WINDOWS: call void @HasGeneric.S
// WINDOWS-NEXT: ret void
// WINDOWS: call void @HasGeneric.O
// WINDOWS-NEXT: ret void
// WINDOWS: call void @HasGeneric.A
// WINDOWS-NEXT: ret void
// WINDOWS-NOT: call void @llvm.trap

ATTR(cpu_dispatch(atom, generic, ivybridge, knl))
void HasParams(int i, double d);
// LINUX: define weak_odr void (i32, double)* @HasParams.resolver()
// LINUX: call void @__cpu_indicator_init
// LINUX: ret void (i32, double)* @HasParams.Z
// LINUX: ret void (i32, double)* @HasParams.S
// LINUX: ret void (i32, double)* @HasParams.O
// LINUX: ret void (i32, double)* @HasParams.A
// LINUX-NOT: call void @llvm.trap

// WINDOWS: define weak_odr dso_local void @HasParams(i32 %0, double %1) comdat
// WINDOWS: call void @__cpu_indicator_init
// WINDOWS: call void @HasParams.Z(i32 %0, double %1)
// WINDOWS-NEXT: ret void
// WINDOWS: call void @HasParams.S(i32 %0, double %1)
// WINDOWS-NEXT: ret void
// WINDOWS: call void @HasParams.O(i32 %0, double %1)
// WINDOWS-NEXT: ret void
// WINDOWS: call void @HasParams.A(i32 %0, double %1)
// WINDOWS-NEXT: ret void
// WINDOWS-NOT: call void @llvm.trap

ATTR(cpu_dispatch(atom, generic, ivybridge, knl))
int HasParamsAndReturn(int i, double d);
// LINUX: define weak_odr i32 (i32, double)* @HasParamsAndReturn.resolver()
// LINUX: call void @__cpu_indicator_init
// LINUX: ret i32 (i32, double)* @HasParamsAndReturn.Z
// LINUX: ret i32 (i32, double)* @HasParamsAndReturn.S
// LINUX: ret i32 (i32, double)* @HasParamsAndReturn.O
// LINUX: ret i32 (i32, double)* @HasParamsAndReturn.A
// LINUX-NOT: call void @llvm.trap

// WINDOWS: define weak_odr dso_local i32 @HasParamsAndReturn(i32 %0, double %1) comdat
// WINDOWS: call void @__cpu_indicator_init
// WINDOWS: %[[RET:.+]] = musttail call i32 @HasParamsAndReturn.Z(i32 %0, double %1)
// WINDOWS-NEXT: ret i32 %[[RET]]
// WINDOWS: %[[RET:.+]] = musttail call i32 @HasParamsAndReturn.S(i32 %0, double %1)
// WINDOWS-NEXT: ret i32 %[[RET]]
// WINDOWS: %[[RET:.+]] = musttail call i32 @HasParamsAndReturn.O(i32 %0, double %1)
// WINDOWS-NEXT: ret i32 %[[RET]]
// WINDOWS: %[[RET:.+]] = musttail call i32 @HasParamsAndReturn.A(i32 %0, double %1)
// WINDOWS-NEXT: ret i32 %[[RET]]
// WINDOWS-NOT: call void @llvm.trap

ATTR(cpu_dispatch(atom, generic, pentium))
int GenericAndPentium(int i, double d);
// LINUX: define weak_odr i32 (i32, double)* @GenericAndPentium.resolver()
// LINUX: call void @__cpu_indicator_init
// LINUX: ret i32 (i32, double)* @GenericAndPentium.O
// LINUX: ret i32 (i32, double)* @GenericAndPentium.B
// LINUX-NOT: ret i32 (i32, double)* @GenericAndPentium.A
// LINUX-NOT: call void @llvm.trap

// WINDOWS: define weak_odr dso_local i32 @GenericAndPentium(i32 %0, double %1) comdat
// WINDOWS: call void @__cpu_indicator_init
// WINDOWS: %[[RET:.+]] = musttail call i32 @GenericAndPentium.O(i32 %0, double %1)
// WINDOWS-NEXT: ret i32 %[[RET]]
// WINDOWS: %[[RET:.+]] = musttail call i32 @GenericAndPentium.B(i32 %0, double %1)
// WINDOWS-NEXT: ret i32 %[[RET]]
// WINDOWS-NOT: call i32 @GenericAndPentium.A
// WINDOWS-NOT: call void @llvm.trap

ATTR(cpu_dispatch(atom, pentium))
int DispatchFirst(void);
// LINUX: define weak_odr i32 ()* @DispatchFirst.resolver
// LINUX: ret i32 ()* @DispatchFirst.O
// LINUX: ret i32 ()* @DispatchFirst.B

// WINDOWS: define weak_odr dso_local i32 @DispatchFirst() comdat
// WINDOWS: %[[RET:.+]] = musttail call i32 @DispatchFirst.O()
// WINDOWS-NEXT: ret i32 %[[RET]]
// WINDOWS: %[[RET:.+]] = musttail call i32 @DispatchFirst.B()
// WINDOWS-NEXT: ret i32 %[[RET]]

ATTR(cpu_specific(atom))
int DispatchFirst(void) {return 0;}
// LINUX: define{{.*}} i32 @DispatchFirst.O
// LINUX: ret i32 0

// WINDOWS: define dso_local i32 @DispatchFirst.O()
// WINDOWS: ret i32 0

ATTR(cpu_specific(pentium))
int DispatchFirst(void) {return 1;}
// LINUX: define{{.*}} i32 @DispatchFirst.B
// LINUX: ret i32 1

// WINDOWS: define dso_local i32 @DispatchFirst.B
// WINDOWS: ret i32 1

ATTR(cpu_specific(knl))
void OrderDispatchUsageSpecific(void) {}

// CHECK: attributes #[[S]] = {{.*}}"target-features"="+avx,+cmov,+crc32,+cx8,+f16c,+mmx,+popcnt,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+x87,+xsave"
// CHECK-SAME: "tune-cpu"="ivybridge"
// CHECK: attributes #[[K]] = {{.*}}"target-features"="+adx,+avx,+avx2,+avx512cd,+avx512er,+avx512f,+avx512pf,+bmi,+cmov,+crc32,+cx8,+f16c,+fma,+lzcnt,+mmx,+movbe,+popcnt,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+x87,+xsave"
// CHECK-SAME: "tune-cpu"="knl"
// CHECK: attributes #[[O]] = {{.*}}"target-features"="+cmov,+cx8,+mmx,+movbe,+sse,+sse2,+sse3,+ssse3,+x87"
// CHECK-SAME: "tune-cpu"="atom"