llvm-project/compiler-rt/lib/builtins
Roman Lebedev e8e95b5b01 [compiler-rt][X86][AMD][Bulldozer] Fix Bulldozer Model 2 detection.
Summary:
The compiler-rt side of D46314

I have discovered an issue by accident.
```
$ lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              8
On-line CPU(s) list: 0-7
Thread(s) per core:  2
Core(s) per socket:  4
Socket(s):           1
NUMA node(s):        1
Vendor ID:           AuthenticAMD
CPU family:          21
Model:               2
Model name:          AMD FX(tm)-8350 Eight-Core Processor
Stepping:            0
CPU MHz:             3584.018
CPU max MHz:         4000.0000
CPU min MHz:         1400.0000
BogoMIPS:            8027.22
Virtualization:      AMD-V
L1d cache:           16K
L1i cache:           64K
L2 cache:            2048K
L3 cache:            8192K
NUMA node0 CPU(s):   0-7
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb cpb hw_pstate vmmcall bmi1 arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold
```
So this is a model-2 Bulldozer-family AMD CPU.

GCC agrees:
```
$ echo | gcc -E - -march=native -###
<...>
 /usr/lib/gcc/x86_64-linux-gnu/7/cc1 -E -quiet -imultiarch x86_64-linux-gnu - "-march=bdver2" -mmmx -mno-3dnow -msse -msse2 -msse3 -mssse3 -msse4a -mcx16 -msahf -mno-movbe -maes -mno-sha -mpclmul -mpopcnt -mabm -mlwp -mfma -mfma4 -mxop -mbmi -mno-sgx -mno-bmi2 -mtbm -mavx -mno-avx2 -msse4.2 -msse4.1 -mlzcnt -mno-rtm -mno-hle -mno-rdrnd -mf16c -mno-fsgsbase -mno-rdseed -mprfchw -mno-adx -mfxsr -mxsave -mno-xsaveopt -mno-avx512f -mno-avx512er -mno-avx512cd -mno-avx512pf -mno-prefetchwt1 -mno-clflushopt -mno-xsavec -mno-xsaves -mno-avx512dq -mno-avx512bw -mno-avx512vl -mno-avx512ifma -mno-avx512vbmi -mno-avx5124fmaps -mno-avx5124vnniw -mno-clwb -mno-mwaitx -mno-clzero -mno-pku -mno-rdpid --param "l1-cache-size=16" --param "l1-cache-line-size=64" --param "l2-cache-size=2048" "-mtune=bdver2"
<...>
```

But clang does not: (look for `bdver1`)
```
$ echo | clang -E - -march=native -###
clang version 7.0.0- (trunk)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/local/bin
 "/usr/lib/llvm-7/bin/clang" "-cc1" "-triple" "x86_64-pc-linux-gnu" "-E" "-disable-free" "-disable-llvm-verifier" "-discard-value-names" "-main-file-name" "-" "-mrelocation-model" "static" "-mthread-model" "posix" "-mdisable-fp-elim" "-fmath-errno" "-masm-verbose" "-mconstructor-aliases" "-munwind-tables" "-fuse-init-array" "-target-cpu" "bdver1" "-target-feature" "+sse2" "-target-feature" "+cx16" "-target-feature" "+sahf" "-target-feature" "+tbm" "-target-feature" "-avx512ifma" "-target-feature" "-sha" "-target-feature" "-gfni" "-target-feature" "+fma4" "-target-feature" "-vpclmulqdq" "-target-feature" "+prfchw" "-target-feature" "-bmi2" "-target-feature" "-cldemote" "-target-feature" "-fsgsbase" "-target-feature" "-xsavec" "-target-feature" "+popcnt" "-target-feature" "+aes" "-target-feature" "-avx512bitalg" "-target-feature" "-xsaves" "-target-feature" "-avx512er" "-target-feature" "-avx512vnni" "-target-feature" "-avx512vpopcntdq" "-target-feature" "-clwb" "-target-feature" "-avx512f" "-target-feature" "-clzero" "-target-feature" "-pku" "-target-feature" "+mmx" "-target-feature" "+lwp" "-target-feature" "-rdpid" "-target-feature" "+xop" "-target-feature" "-rdseed" "-target-feature" "-waitpkg" "-target-feature" "-ibt" "-target-feature" "+sse4a" "-target-feature" "-avx512bw" "-target-feature" "-clflushopt" "-target-feature" "+xsave" "-target-feature" "-avx512vbmi2" "-target-feature" "-avx512vl" "-target-feature" "-avx512cd" "-target-feature" "+avx" "-target-feature" "-vaes" "-target-feature" "-rtm" "-target-feature" "+fma" "-target-feature" "+bmi" "-target-feature" "-rdrnd" "-target-feature" "-mwaitx" "-target-feature" "+sse4.1" "-target-feature" "+sse4.2" "-target-feature" "-avx2" "-target-feature" "-wbnoinvd" "-target-feature" "+sse" "-target-feature" "+lzcnt" "-target-feature" "+pclmul" "-target-feature" "-prefetchwt1" "-target-feature" "+f16c" "-target-feature" "+ssse3" "-target-feature" "-sgx" "-target-feature" "-shstk" "-target-feature" "+cmov" 
"-target-feature" "-avx512vbmi" "-target-feature" "-movbe" "-target-feature" "-xsaveopt" "-target-feature" "-avx512dq" "-target-feature" "-adx" "-target-feature" "-avx512pf" "-target-feature" "+sse3" "-dwarf-column-info" "-debugger-tuning=gdb" "-resource-dir" "/usr/lib/llvm-7/lib/clang/7.0.0" "-internal-isystem" "/usr/local/include" "-internal-isystem" "/usr/lib/llvm-7/lib/clang/7.0.0/include" "-internal-externc-isystem" "/usr/include/x86_64-linux-gnu" "-internal-externc-isystem" "/include" "-internal-externc-isystem" "/usr/include" "-fdebug-compilation-dir" "/build/llvm-build-Clang-release" "-ferror-limit" "19" "-fmessage-length" "271" "-fobjc-runtime=gcc" "-fdiagnostics-show-option" "-fcolor-diagnostics" "-o" "-" "-x" "c" "-"
```

So clang, unlike gcc, considers this to be `bdver1`.

After some digging, I came across `getAMDProcessorTypeAndSubtype()` in `Host.cpp`.
I have added the following debug printf after the call to that function in `sys::getHostCPUName()`:
```
errs() << "Family " << Family << " Model " << Model << " Type " << Type << "\n";
```
Which produced:
```
Family 21 Model 2 Type 5
```
Which matches the `lscpu` output.

As was pointed out in the review by @craig.topper:
>>! In D46314#1084123, @craig.topper wrote:
> I dont' think this is right. Here is what I found on wikipedia. https://en.wikipedia.org/wiki/List_of_AMD_CPU_microarchitectures.
>
> AMD Bulldozer Family 15h - the successor of 10h/K10. Bulldozer is designed for processors in the 10 to 220W category, implementing XOP, FMA4 and CVT16 instruction sets. Orochi was the first design which implemented it. For Bulldozer, CPUID model numbers are 00h and 01h.
> AMD Piledriver Family 15h (2nd-gen) - successor to Bulldozer. CPUID model numbers are 02h (earliest "Vishera" Piledrivers) and 10h-1Fh.
> AMD Steamroller Family 15h (3rd-gen) - third-generation Bulldozer derived core. CPUID model numbers are 30h-3Fh.
> AMD Excavator Family 15h (4th-gen) - fourth-generation Bulldozer derived core. CPUID model numbers are 60h-6Fh, later updated revisions have model numbers 70h-7Fh.
>
>
> So there's a weird exception where model 2 should go with 0x10-0x1f.

It does not help that the code can't be tested at the moment.
With this logical change, the `bdver2` is properly detected.
```
$ echo | /build/llvm-build-Clang-release/bin/clang -E - -march=native -###
clang version 7.0.0 (trunk 331249) (llvm/trunk 331256)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /build/llvm-build-Clang-release/bin
 "/build/llvm-build-Clang-release/bin/clang-7" "-cc1" "-triple" "x86_64-unknown-linux-gnu" "-E" "-disable-free" "-main-file-name" "-" "-mrelocation-model" "static" "-mthread-model" "posix" "-mdisable-fp-elim" "-fmath-errno" "-masm-verbose" "-mconstructor-aliases" "-munwind-tables" "-fuse-init-array" "-target-cpu" "bdver2" "-target-feature" "+sse2" "-target-feature" "+cx16" "-target-feature" "+sahf" "-target-feature" "+tbm" "-target-feature" "-avx512ifma" "-target-feature" "-sha" "-target-feature" "-gfni" "-target-feature" "+fma4" "-target-feature" "-vpclmulqdq" "-target-feature" "+prfchw" "-target-feature" "-bmi2" "-target-feature" "-cldemote" "-target-feature" "-fsgsbase" "-target-feature" "-xsavec" "-target-feature" "+popcnt" "-target-feature" "+aes" "-target-feature" "-avx512bitalg" "-target-feature" "-movdiri" "-target-feature" "-xsaves" "-target-feature" "-avx512er" "-target-feature" "-avx512vnni" "-target-feature" "-avx512vpopcntdq" "-target-feature" "-clwb" "-target-feature" "-avx512f" "-target-feature" "-clzero" "-target-feature" "-pku" "-target-feature" "+mmx" "-target-feature" "+lwp" "-target-feature" "-rdpid" "-target-feature" "+xop" "-target-feature" "-rdseed" "-target-feature" "-waitpkg" "-target-feature" "-movdir64b" "-target-feature" "-ibt" "-target-feature" "+sse4a" "-target-feature" "-avx512bw" "-target-feature" "-clflushopt" "-target-feature" "+xsave" "-target-feature" "-avx512vbmi2" "-target-feature" "-avx512vl" "-target-feature" "-avx512cd" "-target-feature" "+avx" "-target-feature" "-vaes" "-target-feature" "-rtm" "-target-feature" "+fma" "-target-feature" "+bmi" "-target-feature" "-rdrnd" "-target-feature" "-mwaitx" "-target-feature" "+sse4.1" "-target-feature" "+sse4.2" "-target-feature" "-avx2" "-target-feature" "-wbnoinvd" "-target-feature" "+sse" "-target-feature" "+lzcnt" "-target-feature" "+pclmul" "-target-feature" "-prefetchwt1" "-target-feature" "+f16c" "-target-feature" "+ssse3" "-target-feature" "-sgx" "-target-feature" "-shstk" 
"-target-feature" "+cmov" "-target-feature" "-avx512vbmi" "-target-feature" "-movbe" "-target-feature" "-xsaveopt" "-target-feature" "-avx512dq" "-target-feature" "-adx" "-target-feature" "-avx512pf" "-target-feature" "+sse3" "-dwarf-column-info" "-debugger-tuning=gdb" "-resource-dir" "/build/llvm-build-Clang-release/lib/clang/7.0.0" "-internal-isystem" "/usr/local/include" "-internal-isystem" "/build/llvm-build-Clang-release/lib/clang/7.0.0/include" "-internal-externc-isystem" "/usr/include/x86_64-linux-gnu" "-internal-externc-isystem" "/include" "-internal-externc-isystem" "/usr/include" "-fdebug-compilation-dir" "/build/llvm-build-Clang-release" "-ferror-limit" "19" "-fmessage-length" "271" "-fobjc-runtime=gcc" "-fdiagnostics-show-option" "-fcolor-diagnostics" "-o" "-" "-x" "c" "-"
```
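
The model ranges quoted in the review above can be written as a small classifier. This is a minimal sketch, not the code in `cpu_model.c`: the function name `sketch_fam15h_bdver` and its integer return convention are assumptions made for illustration, and it covers only the ranges listed in the quote.

```c
/* Hypothetical helper, for illustration only.
   Maps a Family 15h (decimal 21) CPUID model number to the N in "bdverN",
   using the ranges quoted from the review. Returns 0 for models that the
   quoted list does not mention. */
int sketch_fam15h_bdver(unsigned model) {
    if (model >= 0x60 && model <= 0x7f)
        return 4; /* Excavator */
    if (model >= 0x30 && model <= 0x3f)
        return 3; /* Steamroller */
    if ((model >= 0x10 && model <= 0x1f) || model == 0x02)
        return 2; /* Piledriver, including the model 02h exception */
    if (model <= 0x01)
        return 1; /* Bulldozer */
    return 0;     /* not covered by the quoted ranges */
}
```

With this mapping, the FX-8350 from the `lscpu` output (family 21, model 2) lands in the `bdver2` group, which matches what GCC reports.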

Reviewers: craig.topper, asbirlea, rnk, GGanesh, andreadb

Reviewed By: craig.topper

Subscribers: sdardis, dberris, aprantl, arichardson, JDevlieghere, #sanitizers, llvm-commits, cfe-commits, craig.topper

Differential Revision: https://reviews.llvm.org/D46323

llvm-svn: 331295
2018-05-01 18:40:15 +00:00
Darwin-excludes [Darwin] [Builtins] Cleaning up OS X exclude lists. NFC. 2016-03-29 17:34:13 +00:00
aarch64 [builtins] Implement __chkstk for arm64 windows 2017-12-20 06:52:52 +00:00
arm [Builtins] Do not use tailcall for Thumb1 2017-11-09 17:32:57 +00:00
i386 Delete remaining compiler-rt makefiles 2016-08-23 17:32:38 +00:00
macho_embedded [CMake] [macho_embedded] [builtins] Need to also drop the bswap builtins. 2015-10-09 22:46:19 +00:00
ppc Delete remaining compiler-rt makefiles 2016-08-23 17:32:38 +00:00
riscv [PATCH] [compiler-rt, RISCV] Support builtins for RISC-V 2018-03-01 07:47:27 +00:00
x86_64 [builtins] Make some ISA macro checks work with MSVC 2017-04-07 17:18:43 +00:00
CMakeLists.txt [PATCH] [compiler-rt, RISCV] Support builtins for RISC-V 2018-03-01 07:47:27 +00:00
README.txt Add generic __bswap[ds]i2 implementations 2017-05-25 14:52:14 +00:00
absvdi2.c
absvsi2.c
absvti2.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
adddf3.c [compiler-rt] Add back ARM EABI aliases where legal. 2017-10-03 21:25:07 +00:00
addsf3.c [compiler-rt] Add back ARM EABI aliases where legal. 2017-10-03 21:25:07 +00:00
addtf3.c Provide add and sub for IEEE quad. From GuanHong Liu. 2014-06-19 20:24:49 +00:00
addvdi3.c PR21518: Use unsigned arithmetic for trapping add/sub functions. 2014-11-12 23:01:24 +00:00
addvsi3.c PR21518: Use unsigned arithmetic for trapping add/sub functions. 2014-11-12 23:01:24 +00:00
addvti3.c PR21518: Use unsigned arithmetic for trapping add/sub functions. 2014-11-12 23:01:24 +00:00
apple_versioning.c
ashldi3.c [compiler-rt] Add back ARM EABI aliases where legal. 2017-10-03 21:25:07 +00:00
ashlti3.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
ashrdi3.c [compiler-rt] Add back ARM EABI aliases where legal. 2017-10-03 21:25:07 +00:00
ashrti3.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
assembly.h [builtins] ARM: Reland fix for assembling builtins in thumb state. 2017-10-02 20:56:49 +00:00
atomic.c Atomics library: provide operations for __int128 when it is available. 2016-10-27 01:46:24 +00:00
atomic_flag_clear.c Code cleanup based on post-commit review from Aaron Ballman. 2015-09-23 15:28:35 +00:00
atomic_flag_clear_explicit.c Code cleanup based on post-commit review from Aaron Ballman. 2015-09-23 15:28:35 +00:00
atomic_flag_test_and_set.c [builtins] One more stab at fixing the bots. 2015-09-22 22:19:21 +00:00
atomic_flag_test_and_set_explicit.c [builtins] One more stab at fixing the bots. 2015-09-22 22:19:21 +00:00
atomic_signal_fence.c [builtins] One more stab at fixing the bots. 2015-09-22 22:19:21 +00:00
atomic_thread_fence.c [builtins] One more stab at fixing the bots. 2015-09-22 22:19:21 +00:00
bswapdi2.c Add generic __bswap[ds]i2 implementations 2017-05-25 14:52:14 +00:00
bswapsi2.c Add generic __bswap[ds]i2 implementations 2017-05-25 14:52:14 +00:00
clear_cache.c [builtins] Align addresses to cache lines in __clear_cache for aarch64 2018-01-24 10:14:52 +00:00
clzdi2.c [builtins] Workaround for infinite recursion in c?zdi2 2018-02-08 11:14:11 +00:00
clzsi2.c
clzti2.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
cmpdi2.c
cmpti2.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
comparedf2.c [compiler-rt] Add back ARM EABI aliases where legal. 2017-10-03 21:25:07 +00:00
comparesf2.c [compiler-rt] Add back ARM EABI aliases where legal. 2017-10-03 21:25:07 +00:00
comparetf2.c builtins: restrict aliases 2015-08-21 04:39:52 +00:00
cpu_model.c [compiler-rt][X86][AMD][Bulldozer] Fix Bulldozer Model 2 detection. 2018-05-01 18:40:15 +00:00
ctzdi2.c [builtins] Workaround for infinite recursion in c?zdi2 2018-02-08 11:14:11 +00:00
ctzsi2.c
ctzti2.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
divdc3.c builtins: emulate _Complex for cl 2015-10-07 02:58:07 +00:00
divdf3.c [compiler-rt] Add back ARM EABI aliases where legal. 2017-10-03 21:25:07 +00:00
divdi3.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
divmoddi4.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
divmodsi4.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
divsc3.c builtins: emulate _Complex for cl 2015-10-07 02:58:07 +00:00
divsf3.c [compiler-rt] Add back ARM EABI aliases where legal. 2017-10-03 21:25:07 +00:00
divsi3.c [compiler-rt] Add back ARM EABI aliases where legal. 2017-10-03 21:25:07 +00:00
divtc3.c [builtins] Fix MSVC build 2017-04-07 16:54:32 +00:00
divtf3.c Implement __divtf3 for IEEE quad precision. 2014-05-30 11:08:18 +00:00
divti3.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
divxc3.c builtins: emulate _Complex for cl 2015-10-07 02:58:07 +00:00
emutls.c [builtins] Use Interlocked* intrinsics for atomics on MSVC 2017-08-03 19:04:28 +00:00
enable_execute_stack.c [builtins] Fix mingw-w64 cross compilation 2017-07-31 06:01:39 +00:00
eprintf.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
extenddftf2.c Add __extenddftf2 and __extendsftf2 for IEEE quad precision. 2014-05-29 01:00:39 +00:00
extendhfsf2.c [compiler-rt] Add back ARM EABI aliases where legal. 2017-10-03 21:25:07 +00:00
extendsfdf2.c [compiler-rt] Add back ARM EABI aliases where legal. 2017-10-03 21:25:07 +00:00
extendsftf2.c Add __extenddftf2 and __extendsftf2 for IEEE quad precision. 2014-05-29 01:00:39 +00:00
ffsdi2.c
ffssi2.c Add __ffssi2 implementation to compiler-rt builtins 2017-04-06 18:12:02 +00:00
ffsti2.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
fixdfdi.c [compiler-rt] Add back ARM EABI aliases where legal. 2017-10-03 21:25:07 +00:00
fixdfsi.c [compiler-rt] Add back ARM EABI aliases where legal. 2017-10-03 21:25:07 +00:00
fixdfti.c Refactor float to integer conversion to share the same code. 2015-03-11 21:13:56 +00:00
fixsfdi.c [compiler-rt] Add back ARM EABI aliases where legal. 2017-10-03 21:25:07 +00:00
fixsfsi.c [compiler-rt] Add back ARM EABI aliases where legal. 2017-10-03 21:25:07 +00:00
fixsfti.c Refactor float to integer conversion to share the same code. 2015-03-11 21:13:56 +00:00
fixtfdi.c Use signed int implementation for __fixint 2015-03-12 21:36:10 +00:00
fixtfsi.c Use signed int implementation for __fixint 2015-03-12 21:36:10 +00:00
fixtfti.c Use signed int implementation for __fixint 2015-03-12 21:36:10 +00:00
fixunsdfdi.c [compiler-rt] Add back ARM EABI aliases where legal. 2017-10-03 21:25:07 +00:00
fixunsdfsi.c [compiler-rt] Add back ARM EABI aliases where legal. 2017-10-03 21:25:07 +00:00
fixunsdfti.c Fix float->uint conversion for inputs less than 0 2015-05-01 16:02:16 +00:00
fixunssfdi.c [compiler-rt] Add back ARM EABI aliases where legal. 2017-10-03 21:25:07 +00:00
fixunssfsi.c [compiler-rt] Add back ARM EABI aliases where legal. 2017-10-03 21:25:07 +00:00
fixunssfti.c We want single precision here. 2015-03-13 00:18:28 +00:00
fixunstfdi.c Refactor float to integer conversion to share the same code. 2015-03-11 21:13:56 +00:00
fixunstfsi.c Refactor float to integer conversion to share the same code. 2015-03-11 21:13:56 +00:00
fixunstfti.c Refactor float to integer conversion to share the same code. 2015-03-11 21:13:56 +00:00
fixunsxfdi.c Refactor float to integer conversion to share the same code. 2015-03-11 21:13:56 +00:00
fixunsxfsi.c Refactor float to integer conversion to share the same code. 2015-03-11 21:13:56 +00:00
fixunsxfti.c Refactor float to integer conversion to share the same code. 2015-03-11 21:13:56 +00:00
fixxfdi.c Refactor float to integer conversion to share the same code. 2015-03-11 21:13:56 +00:00
fixxfti.c Refactor float to integer conversion to share the same code. 2015-03-11 21:13:56 +00:00
floatdidf.c [compiler-rt] Add back ARM EABI aliases where legal. 2017-10-03 21:25:07 +00:00
floatdisf.c [compiler-rt] Add back ARM EABI aliases where legal. 2017-10-03 21:25:07 +00:00
floatditf.c [compiler-rt] Add AArch64 to CMake configuration and several missing builtins 2015-08-18 13:43:37 +00:00
floatdixf.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
floatsidf.c [compiler-rt] Add back ARM EABI aliases where legal. 2017-10-03 21:25:07 +00:00
floatsisf.c [compiler-rt] Add back ARM EABI aliases where legal. 2017-10-03 21:25:07 +00:00
floatsitf.c Fix __floatsitf() for negative input 2015-07-31 13:32:09 +00:00
floattidf.c [builtins] replace tabs by spaces and remove whitespace at end of line NFC 2016-06-13 15:21:04 +00:00
floattisf.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
floattitf.c [builtins] Implement __floattitf() & __floatuntitf() 2017-01-06 18:46:35 +00:00
floattixf.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
floatundidf.c [compiler-rt] Add back ARM EABI aliases where legal. 2017-10-03 21:25:07 +00:00
floatundisf.c [compiler-rt] Add back ARM EABI aliases where legal. 2017-10-03 21:25:07 +00:00
floatunditf.c [compiler-rt] Add AArch64 to CMake configuration and several missing builtins 2015-08-18 13:43:37 +00:00
floatundixf.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
floatunsidf.c [compiler-rt] Add back ARM EABI aliases where legal. 2017-10-03 21:25:07 +00:00
floatunsisf.c [compiler-rt] Add back ARM EABI aliases where legal. 2017-10-03 21:25:07 +00:00
floatunsitf.c Implement floatsitf, floatunstfsi, which perform 2014-09-16 20:34:41 +00:00
floatuntidf.c [builtins] replace tabs by spaces and remove whitespace at end of line NFC 2016-06-13 15:21:04 +00:00
floatuntisf.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
floatuntitf.c [builtins] Implement __floattitf() & __floatuntitf() 2017-01-06 18:46:35 +00:00
floatuntixf.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
fp_add_impl.inc builtins: spell inline as __inline 2015-10-10 21:21:28 +00:00
fp_extend.h builtins: spell inline as __inline 2015-10-10 21:21:28 +00:00
fp_extend_impl.inc builtins: spell inline as __inline 2015-10-10 21:21:28 +00:00
fp_fixint_impl.inc builtins: spell inline as __inline 2015-10-10 21:21:28 +00:00
fp_fixuint_impl.inc [compiler-rt][aarch64] New tests for 128-bit floating-point builtins, fixes of tests and __fixuint 2015-11-05 18:36:42 +00:00
fp_lib.h builtins: spell inline as __inline 2015-10-10 21:21:28 +00:00
fp_mul_impl.inc builtins: spell inline as __inline 2015-10-10 21:21:28 +00:00
fp_trunc.h builtins: spell inline as __inline 2015-10-10 21:21:28 +00:00
fp_trunc_impl.inc builtins: spell inline as __inline 2015-10-10 21:21:28 +00:00
gcc_personality_v0.c Revert "builtins: erase `struct` modifier for EH personality" 2017-08-22 04:19:51 +00:00
int_endianness.h Remove Bitrig: CompilerRT Changes 2017-07-21 22:47:46 +00:00
int_lib.h [compiler-rt] Add back ARM EABI aliases where legal. 2017-10-03 21:25:07 +00:00
int_math.h builtins: fix build 2015-10-07 03:30:02 +00:00
int_types.h [PATCH] [compiler-rt, RISCV] Support builtins for RISC-V 2018-03-01 07:47:27 +00:00
int_util.c [builtins] Better Fuchsia support 2017-07-12 19:33:30 +00:00
int_util.h builtins: Use MSVC-equivalents of attributes 2015-10-06 04:33:05 +00:00
lshrdi3.c [compiler-rt] Add back ARM EABI aliases where legal. 2017-10-03 21:25:07 +00:00
lshrti3.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
mingw_fixfloat.c builtins: Allow building windows arm functions for mingw 2016-11-19 21:22:38 +00:00
moddi3.c Avoid type pruning. 2014-03-01 15:32:05 +00:00
modsi3.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
modti3.c Avoid type pruning. 2014-03-01 15:32:05 +00:00
muldc3.c builtins: emulate _Complex for cl 2015-10-07 02:58:07 +00:00
muldf3.c [compiler-rt] Add back ARM EABI aliases where legal. 2017-10-03 21:25:07 +00:00
muldi3.c [compiler-rt] Add back ARM EABI aliases where legal. 2017-10-03 21:25:07 +00:00
mulodi4.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
mulosi4.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
muloti4.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
mulsc3.c builtins: emulate _Complex for cl 2015-10-07 02:58:07 +00:00
mulsf3.c [compiler-rt] Add back ARM EABI aliases where legal. 2017-10-03 21:25:07 +00:00
multc3.c [compiler-rt] Add AArch64 to CMake configuration and several missing builtins 2015-08-18 13:43:37 +00:00
multf3.c Provide mul for IEEE quad. From GuanHong Liu. 2014-06-19 20:34:03 +00:00
multi3.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
mulvdi3.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
mulvsi3.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
mulvti3.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
mulxc3.c builtins: emulate _Complex for cl 2015-10-07 02:58:07 +00:00
negdf2.c [compiler-rt] Add back ARM EABI aliases where legal. 2017-10-03 21:25:07 +00:00
negdi2.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
negsf2.c [compiler-rt] Add back ARM EABI aliases where legal. 2017-10-03 21:25:07 +00:00
negti2.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
negvdi2.c
negvsi2.c
negvti2.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
os_version_check.c [compiler-rt][builtins] Ignore the deprecated warning for 2017-03-15 12:13:20 +00:00
paritydi2.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
paritysi2.c
parityti2.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
popcountdi2.c
popcountsi2.c
popcountti2.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
powidf2.c
powisf2.c
powitf2.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
powixf2.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
subdf3.c [compiler-rt] Add back ARM EABI aliases where legal. 2017-10-03 21:25:07 +00:00
subsf3.c [compiler-rt] Add back ARM EABI aliases where legal. 2017-10-03 21:25:07 +00:00
subtf3.c Provide add and sub for IEEE quad. From GuanHong Liu. 2014-06-19 20:24:49 +00:00
subvdi3.c PR21518: Use unsigned arithmetic for trapping add/sub functions. 2014-11-12 23:01:24 +00:00
subvsi3.c PR21518: Use unsigned arithmetic for trapping add/sub functions. 2014-11-12 23:01:24 +00:00
subvti3.c PR21518: Use unsigned arithmetic for trapping add/sub functions. 2014-11-12 23:01:24 +00:00
trampoline_setup.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
truncdfhf2.c [compiler-rt] Add back ARM EABI aliases where legal. 2017-10-03 21:25:07 +00:00
truncdfsf2.c [compiler-rt] Add back ARM EABI aliases where legal. 2017-10-03 21:25:07 +00:00
truncsfhf2.c [compiler-rt] Add back ARM EABI aliases where legal. 2017-10-03 21:25:07 +00:00
trunctfdf2.c Implement __trunctfdf2 and __trunctfsf2 for IEEE quad precision. 2014-05-29 00:58:27 +00:00
trunctfsf2.c Implement __trunctfdf2 and __trunctfsf2 for IEEE quad precision. 2014-05-29 00:58:27 +00:00
ucmpdi2.c
ucmpti2.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
udivdi3.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
udivmoddi4.c Don't take short cuts trying to avoid conditionals. This leads to 2014-03-18 22:10:36 +00:00
udivmodsi4.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
udivmodti4.c Don't take short cuts trying to avoid conditionals. This leads to 2014-03-18 22:10:36 +00:00
udivsi3.c [compiler-rt] Add back ARM EABI aliases where legal. 2017-10-03 21:25:07 +00:00
udivti3.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
umoddi3.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
umodsi3.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
umodti3.c Consistently use COMPILER_RT_ABI for all public symbols. 2014-03-01 15:30:50 +00:00
unwind-ehabi-helpers.h builtins: repair the builtins build with clang 3.8 2016-11-18 18:21:06 +00:00

README.txt

Compiler-RT
================================

This directory and its subdirectories contain source code for the compiler
support routines.

Compiler-RT is open source software. You may freely distribute it under the
terms of the license agreement found in LICENSE.txt.

================================

This is a replacement library for libgcc.  Each function is contained
in its own file.  Each function has a corresponding unit test under
test/Unit.

A rudimentary script to test each file is in the file called
test/Unit/test.

Here is the specification for this library:

http://gcc.gnu.org/onlinedocs/gccint/Libgcc.html#Libgcc

Here is a synopsis of the contents of this library:

typedef      int si_int;
typedef unsigned su_int;

typedef          long long di_int;
typedef unsigned long long du_int;

// Integral bit manipulation

di_int __ashldi3(di_int a, si_int b);      // a << b
ti_int __ashlti3(ti_int a, si_int b);      // a << b

di_int __ashrdi3(di_int a, si_int b);      // a >> b  arithmetic (sign fill)
ti_int __ashrti3(ti_int a, si_int b);      // a >> b  arithmetic (sign fill)
di_int __lshrdi3(di_int a, si_int b);      // a >> b  logical    (zero fill)
ti_int __lshrti3(ti_int a, si_int b);      // a >> b  logical    (zero fill)
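
To illustrate the difference between the sign-fill and zero-fill right shifts listed above, here is a minimal sketch that works on raw 64-bit patterns. The names `sketch_lshr64` and `sketch_ashr64` are hypothetical, this is not the library's implementation, and it assumes 0 <= b < 64.

```c
#include <stdint.h>

/* Logical right shift: vacated high bits are filled with zeros. */
uint64_t sketch_lshr64(uint64_t bits, int b) {
    return bits >> b;
}

/* Arithmetic right shift on a two's-complement bit pattern: vacated high
   bits are filled with copies of the original sign bit (bit 63). */
uint64_t sketch_ashr64(uint64_t bits, int b) {
    uint64_t shifted = bits >> b;
    if (bits >> 63) {
        /* Set the top b bits; for b == 0 the mask is empty. */
        shifted |= ~(~UINT64_C(0) >> b);
    }
    return shifted;
}
```

For the pattern of -16 (0xFFFFFFFFFFFFFFF0) shifted by 2, the arithmetic variant yields the pattern of -4, while the logical variant yields a large positive value.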

si_int __clzsi2(si_int a);  // count leading zeros
si_int __clzdi2(di_int a);  // count leading zeros
si_int __clzti2(ti_int a);  // count leading zeros
si_int __ctzsi2(si_int a);  // count trailing zeros
si_int __ctzdi2(di_int a);  // count trailing zeros
si_int __ctzti2(ti_int a);  // count trailing zeros
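
As a rough illustration of what the count-leading-zeros and count-trailing-zeros routines compute, the sketch below uses a binary search for the leading count and a simple loop for the trailing count. The names `sketch_clz32` and `sketch_ctz32` are hypothetical and this is not the library source. Returning 32 for a zero input is this sketch's own choice; the libgcc specification leaves that case undefined.

```c
#include <stdint.h>

/* Count leading zero bits of a 32-bit value by halving the search window. */
int sketch_clz32(uint32_t x) {
    int n = 0;
    if (x == 0)
        return 32; /* sketch-specific choice; undefined in the libgcc spec */
    if ((x & 0xFFFF0000u) == 0) { n += 16; x <<= 16; }
    if ((x & 0xFF000000u) == 0) { n += 8;  x <<= 8;  }
    if ((x & 0xF0000000u) == 0) { n += 4;  x <<= 4;  }
    if ((x & 0xC0000000u) == 0) { n += 2;  x <<= 2;  }
    if ((x & 0x80000000u) == 0) { n += 1; }
    return n;
}

/* Count trailing zero bits of a 32-bit value, one bit at a time. */
int sketch_ctz32(uint32_t x) {
    int n = 0;
    if (x == 0)
        return 32; /* sketch-specific choice; undefined in the libgcc spec */
    while ((x & 1u) == 0) {
        x >>= 1;
        n++;
    }
    return n;
}
```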

si_int __ffssi2(si_int a);  // find least significant 1 bit
si_int __ffsdi2(di_int a);  // find least significant 1 bit
si_int __ffsti2(ti_int a);  // find least significant 1 bit

si_int __paritysi2(si_int a);  // bit parity
si_int __paritydi2(di_int a);  // bit parity
si_int __parityti2(ti_int a);  // bit parity

si_int __popcountsi2(si_int a);  // bit population
si_int __popcountdi2(di_int a);  // bit population
si_int __popcountti2(ti_int a);  // bit population
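
The three groups above (find first set, parity, population count) are closely related. A minimal sketch of their semantics follows; the `sketch_` names are hypothetical and the loops favor clarity over speed, so this should not be read as the library's implementation.

```c
#include <stdint.h>

/* Number of 1 bits; each iteration clears the lowest set bit. */
int sketch_popcount32(uint32_t x) {
    int n = 0;
    while (x != 0) {
        x &= x - 1;
        n++;
    }
    return n;
}

/* Parity: 1 if the number of 1 bits is odd, otherwise 0. */
int sketch_parity32(uint32_t x) {
    return sketch_popcount32(x) & 1;
}

/* One plus the index of the least significant 1 bit, or 0 if x is zero. */
int sketch_ffs32(uint32_t x) {
    int n = 1;
    if (x == 0)
        return 0;
    while ((x & 1u) == 0) {
        x >>= 1;
        n++;
    }
    return n;
}
```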

uint32_t __bswapsi2(uint32_t a);   // a byteswapped
uint64_t __bswapdi2(uint64_t a);   // a byteswapped
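
A byte swap reverses the order of the bytes in a word. The following is a minimal sketch for the 32-bit case under the hypothetical name `sketch_bswap32`; it shows the intended result, not necessarily how the library computes it.

```c
#include <stdint.h>

/* Reverse the four bytes of a 32-bit value with masks and shifts. */
uint32_t sketch_bswap32(uint32_t a) {
    return ((a & 0x000000FFu) << 24) |
           ((a & 0x0000FF00u) << 8)  |
           ((a & 0x00FF0000u) >> 8)  |
           ((a & 0xFF000000u) >> 24);
}
```

Applying the function twice returns the original value.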

// Integral arithmetic

di_int __negdi2    (di_int a);                         // -a
ti_int __negti2    (ti_int a);                         // -a
di_int __muldi3    (di_int a, di_int b);               // a * b
ti_int __multi3    (ti_int a, ti_int b);               // a * b
si_int __divsi3    (si_int a, si_int b);               // a / b   signed
di_int __divdi3    (di_int a, di_int b);               // a / b   signed
ti_int __divti3    (ti_int a, ti_int b);               // a / b   signed
su_int __udivsi3   (su_int n, su_int d);               // n / d   unsigned
du_int __udivdi3   (du_int a, du_int b);               // a / b   unsigned
tu_int __udivti3   (tu_int a, tu_int b);               // a / b   unsigned
si_int __modsi3    (si_int a, si_int b);               // a % b   signed
di_int __moddi3    (di_int a, di_int b);               // a % b   signed
ti_int __modti3    (ti_int a, ti_int b);               // a % b   signed
su_int __umodsi3   (su_int a, su_int b);               // a % b   unsigned
du_int __umoddi3   (du_int a, du_int b);               // a % b   unsigned
tu_int __umodti3   (tu_int a, tu_int b);               // a % b   unsigned
du_int __udivmoddi4(du_int a, du_int b, du_int* rem);  // a / b, *rem = a % b  unsigned
tu_int __udivmodti4(tu_int a, tu_int b, tu_int* rem);  // a / b, *rem = a % b  unsigned
su_int __udivmodsi4(su_int a, su_int b, su_int* rem);  // a / b, *rem = a % b  unsigned
si_int __divmodsi4(si_int a, si_int b, si_int* rem);   // a / b, *rem = a % b  signed
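
The combined divide-and-remainder entries above return the quotient and store the remainder through a pointer. One classic way to obtain both at once is shift-and-subtract long division, sketched below under the hypothetical name `sketch_udivmod64`. This is an illustration of the contract, not the library's algorithm, and it assumes the divisor is nonzero.

```c
#include <stddef.h>
#include <stdint.h>

/* Unsigned 64-bit long division, one quotient bit per iteration.
   Returns a / b and, if rem is not NULL, stores a % b in *rem.
   The caller must pass b != 0. */
uint64_t sketch_udivmod64(uint64_t a, uint64_t b, uint64_t *rem) {
    uint64_t q = 0;
    uint64_t r = 0;
    int i;
    for (i = 63; i >= 0; --i) {
        /* Bring down the next bit of the dividend. */
        r = (r << 1) | ((a >> i) & 1u);
        if (r >= b) {
            r -= b;
            q |= UINT64_C(1) << i;
        }
    }
    if (rem != NULL)
        *rem = r;
    return q;
}
```

For example, dividing 100 by 7 this way gives quotient 14 and remainder 2.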



//  Integral arithmetic with trapping overflow

si_int __absvsi2(si_int a);           // abs(a)
di_int __absvdi2(di_int a);           // abs(a)
ti_int __absvti2(ti_int a);           // abs(a)

si_int __negvsi2(si_int a);           // -a
di_int __negvdi2(di_int a);           // -a
ti_int __negvti2(ti_int a);           // -a

si_int __addvsi3(si_int a, si_int b);  // a + b
di_int __addvdi3(di_int a, di_int b);  // a + b
ti_int __addvti3(ti_int a, ti_int b);  // a + b

si_int __subvsi3(si_int a, si_int b);  // a - b
di_int __subvdi3(di_int a, di_int b);  // a - b
ti_int __subvti3(ti_int a, ti_int b);  // a - b

si_int __mulvsi3(si_int a, si_int b);  // a * b
di_int __mulvdi3(di_int a, di_int b);  // a * b
ti_int __mulvti3(ti_int a, ti_int b);  // a * b
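
The trapping variants compute the same result as the plain operation but do not return when the result is out of range. Below is a minimal sketch for addition on `int`. The names are hypothetical, and using `abort()` as the trap is an assumption of this sketch rather than a statement about what the library calls.

```c
#include <limits.h>
#include <stdlib.h>

/* Returns 1 if a + b would leave the range of int, checked without
   performing the overflowing addition itself. */
int sketch_add_overflows(int a, int b) {
    return (b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b);
}

/* Trapping add: terminates the process instead of returning a wrapped
   result. abort() stands in for the trap in this sketch. */
int sketch_addv(int a, int b) {
    if (sketch_add_overflows(a, b))
        abort();
    return a + b;
}
```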


// Integral arithmetic which reports whether overflow occurred

si_int __mulosi4(si_int a, si_int b, int* overflow);  // a * b, overflow set to one if result not in signed range
di_int __mulodi4(di_int a, di_int b, int* overflow);  // a * b, overflow set to one if result not in signed range
ti_int __muloti4(ti_int a, ti_int b, int* overflow);  // a * b, overflow set to one if result not in signed range
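
In contrast to the trapping routines, these report overflow through an out-parameter and always return. A minimal 32-bit sketch that detects overflow by multiplying in a wider type is shown below. The name `sketch_mulo32` is hypothetical, and two details are assumptions of the sketch: the flag is cleared to zero when no overflow occurs, and 0 is returned on overflow to avoid an implementation-defined narrowing conversion.

```c
#include <stdint.h>

/* Multiply two 32-bit signed values and report overflow via *overflow.
   The product is computed exactly in 64 bits and then range-checked. */
int32_t sketch_mulo32(int32_t a, int32_t b, int *overflow) {
    int64_t wide = (int64_t)a * (int64_t)b;
    *overflow = (wide < INT32_MIN || wide > INT32_MAX) ? 1 : 0;
    /* Sketch-specific choice: return 0 when the result does not fit. */
    return *overflow ? 0 : (int32_t)wide;
}
```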


//  Integral comparison: a  < b -> 0
//                       a == b -> 1
//                       a  > b -> 2

si_int __cmpdi2 (di_int a, di_int b);
si_int __cmpti2 (ti_int a, ti_int b);
si_int __ucmpdi2(du_int a, du_int b);
si_int __ucmpti2(tu_int a, tu_int b);
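
The 0/1/2 return convention described above can be sketched in a line each for the signed and unsigned cases. The names `sketch_cmp64` and `sketch_ucmp64` are hypothetical; the library may arrive at the same values differently.

```c
#include <stdint.h>

/* Signed three-way compare: 0 if a < b, 1 if a == b, 2 if a > b. */
int sketch_cmp64(int64_t a, int64_t b) {
    return (a > b) - (a < b) + 1;
}

/* Unsigned three-way compare with the same 0/1/2 convention. */
int sketch_ucmp64(uint64_t a, uint64_t b) {
    return (a > b) - (a < b) + 1;
}
```

The two differ only in how the operands are interpreted: -1 compares below 1 as a signed value, while the same bit pattern compares above 1 as an unsigned value.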

//  Integral / floating point conversion

di_int __fixsfdi(      float a);
di_int __fixdfdi(     double a);
di_int __fixxfdi(long double a);

ti_int __fixsfti(      float a);
ti_int __fixdfti(     double a);
ti_int __fixxfti(long double a);
uint64_t __fixtfdi(long double input);  // ppc only, doesn't match documentation

su_int __fixunssfsi(      float a);
su_int __fixunsdfsi(     double a);
su_int __fixunsxfsi(long double a);

du_int __fixunssfdi(      float a);
du_int __fixunsdfdi(     double a);
du_int __fixunsxfdi(long double a);

tu_int __fixunssfti(      float a);
tu_int __fixunsdfti(     double a);
tu_int __fixunsxfti(long double a);
uint64_t __fixunstfdi(long double input);  // ppc only

float       __floatdisf(di_int a);
double      __floatdidf(di_int a);
long double __floatdixf(di_int a);
long double __floatditf(int64_t a);        // ppc only

float       __floattisf(ti_int a);
double      __floattidf(ti_int a);
long double __floattixf(ti_int a);

float       __floatundisf(du_int a);
double      __floatundidf(du_int a);
long double __floatundixf(du_int a);
long double __floatunditf(uint64_t a);     // ppc only

float       __floatuntisf(tu_int a);
double      __floatuntidf(tu_int a);
long double __floatuntixf(tu_int a);

//  Floating point raised to integer power

float       __powisf2(      float a, si_int b);  // a ^ b
double      __powidf2(     double a, si_int b);  // a ^ b
long double __powixf2(long double a, si_int b);  // a ^ b
long double __powitf2(long double a, si_int b);  // ppc only, a ^ b
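
// In the comments above, "a ^ b" means a raised to the power b, not XOR.
// One common way to compute it is exponentiation by squaring, sketched below.
// sketch_powi is a hypothetical name, and this is an illustration rather than
// a statement of how the library implements it.

```c
// Hypothetical helper: a raised to the integer power b.
// A negative b yields the reciprocal of a raised to |b|.
static double sketch_powi(double a, int b) {
    const int recip = b < 0;
    double r = 1.0;
    for (;;) {
        if (b & 1)      // low bit set: this power of a contributes
            r *= a;
        b /= 2;         // truncates toward zero, so negative b also ends at 0
        if (b == 0)
            break;
        a *= a;         // a, a^2, a^4, ...
    }
    return recip ? 1.0 / r : r;
}
```

// The loop runs once per bit of b, so it needs O(log |b|) multiplications.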

//  Complex arithmetic

//  (a + ib) * (c + id)

      float _Complex __mulsc3( float a,  float b,  float c,  float d);
     double _Complex __muldc3(double a, double b, double c, double d);
long double _Complex __mulxc3(long double a, long double b,
                              long double c, long double d);
long double _Complex __multc3(long double a, long double b,
                              long double c, long double d); // ppc only

//  (a + ib) / (c + id)

      float _Complex __divsc3( float a,  float b,  float c,  float d);
     double _Complex __divdc3(double a, double b, double c, double d);
long double _Complex __divxc3(long double a, long double b,
                              long double c, long double d);
long double _Complex __divtc3(long double a, long double b,
                              long double c, long double d);  // ppc only


//         Runtime support

// __clear_cache() is used to tell the processor that new instructions have been
// written to an address range.  Necessary on processors that do not have
// a unified instruction and data cache.
void __clear_cache(void* start, void* end);

// __enable_execute_stack() is used with nested functions when a trampoline
// function is written onto the stack and that page range needs to be made
// executable.
void __enable_execute_stack(void* addr);

// __gcc_personality_v0() is normally only called by the system unwinder.
// C code (as opposed to C++) normally does not need a personality function
// because there are no catch clauses or destructors to be run.  But there
// is a C language extension __attribute__((cleanup(func))) which marks local
// variables as needing the cleanup function "func" to be run when the
// variable goes out of scope.  That includes when an exception is thrown,
// so a personality handler is needed.  
_Unwind_Reason_Code __gcc_personality_v0(int version, _Unwind_Action actions,
         uint64_t exceptionClass, struct _Unwind_Exception* exceptionObject,
         _Unwind_Context_t context);
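
// The cleanup extension mentioned above can be sketched as follows, assuming
// a compiler that supports __attribute__((cleanup)) such as GCC or Clang.  All
// sketch_* names are hypothetical.  The example shows only the normal
// scope-exit path; no exception is thrown here.

```c
#include <stdlib.h>

// Counts how many times the cleanup function has run.
static int sketch_cleanup_calls = 0;

// Cleanup function: receives a pointer to the variable going out of scope.
static void sketch_free_buffer(char **p) {
    free(*p);            // free(NULL) is harmless if malloc failed
    *p = NULL;
    sketch_cleanup_calls++;
}

static void sketch_use_buffer(void) {
    // sketch_free_buffer(&buf) runs automatically when buf leaves scope.
    __attribute__((cleanup(sketch_free_buffer))) char *buf = malloc(16);
    if (buf)
        buf[0] = '\0';
}
```

// After one call to sketch_use_buffer(), sketch_cleanup_calls is 1.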

// for use with some implementations of assert() in <assert.h>
void __eprintf(const char* format, const char* assertion_expression,
				const char* line, const char* file);

// for systems with emulated thread local storage
void* __emutls_get_address(struct __emutls_control*);


//   Power PC specific functions

// There is no C interface to the saveFP/restFP functions.  They are helper
// functions called by the prolog and epilog of functions that need to save
// a number of non-volatile floating point registers.
saveFP
restFP

// PowerPC has a standard template for trampoline functions.  This function
// generates a custom trampoline function with the specific realFunc
// and localsPtr values.
void __trampoline_setup(uint32_t* trampOnStack, int trampSizeAllocated, 
                                const void* realFunc, void* localsPtr);

// adds two 128-bit double-double precision values ( x + y )
long double __gcc_qadd(long double x, long double y);  

// subtracts two 128-bit double-double precision values ( x - y )
long double __gcc_qsub(long double x, long double y); 

// multiplies two 128-bit double-double precision values ( x * y )
long double __gcc_qmul(long double x, long double y);  

// divides two 128-bit double-double precision values ( x / y )
long double __gcc_qdiv(long double a, long double b);  


//    ARM specific functions

// There is no C interface to the switch* functions.  These helper functions
// are only needed by Thumb1 code for efficient switch table generation.
switch16
switch32
switch8
switchu8

// There is no C interface to the *_vfp_d8_d15_regs functions.  They are
// called in the prolog and epilog of Thumb1 functions.  When the C++ ABI uses
// SJLJ for exceptions, each function with a catch clause or destructors needs
// to save and restore all registers in its prolog and epilog.  But there is
// no way to access vector and high float registers from Thumb1 code, so the
// compiler must add call outs to these helper functions in the prolog and
// epilog.
restore_vfp_d8_d15_regs
save_vfp_d8_d15_regs


// Note: long ago ARM processors did not have floating point hardware support.
// Floating point was done in software and floating point parameters were 
// passed in integer registers.  When hardware support was added for floating
// point, new *vfp functions were added to do the same operations but with 
// floating point parameters in floating point registers.

// Undocumented functions

float  __addsf3vfp(float a, float b);   // Appears to return a + b
double __adddf3vfp(double a, double b); // Appears to return a + b
float  __divsf3vfp(float a, float b);   // Appears to return a / b
double __divdf3vfp(double a, double b); // Appears to return a / b
int    __eqsf2vfp(float a, float b);    // Appears to return  one
                                        //     iff a == b and neither is NaN.
int    __eqdf2vfp(double a, double b);  // Appears to return  one
                                        //     iff a == b and neither is NaN.
double __extendsfdf2vfp(float a);       // Appears to convert from
                                        //     float to double.
int    __fixdfsivfp(double a);          // Appears to convert from
                                        //     double to int.
int    __fixsfsivfp(float a);           // Appears to convert from
                                        //     float to int.
unsigned int __fixunssfsivfp(float a);  // Appears to convert from
                                        //     float to unsigned int.
unsigned int __fixunsdfsivfp(double a); // Appears to convert from
                                        //     double to unsigned int.
double __floatsidfvfp(int a);           // Appears to convert from
                                        //     int to double.
float __floatsisfvfp(int a);            // Appears to convert from
                                        //     int to float.
double __floatunssidfvfp(unsigned int a); // Appears to convert from
                                        //     unsigned int to double.
float __floatunssisfvfp(unsigned int a); // Appears to convert from
                                        //     unsigned int to float.
int __gedf2vfp(double a, double b);     // Appears to return __gedf2
                                        //     (a >= b)
int __gesf2vfp(float a, float b);       // Appears to return __gesf2
                                        //     (a >= b)
int __gtdf2vfp(double a, double b);     // Appears to return __gtdf2
                                        //     (a > b)
int __gtsf2vfp(float a, float b);       // Appears to return __gtsf2
                                        //     (a > b)
int __ledf2vfp(double a, double b);     // Appears to return __ledf2
                                        //     (a <= b)
int __lesf2vfp(float a, float b);       // Appears to return __lesf2
                                        //     (a <= b)
int __ltdf2vfp(double a, double b);     // Appears to return __ltdf2
                                        //     (a < b)
int __ltsf2vfp(float a, float b);       // Appears to return __ltsf2
                                        //     (a < b)
double __muldf3vfp(double a, double b); // Appears to return a * b
float __mulsf3vfp(float a, float b);    // Appears to return a * b
int __nedf2vfp(double a, double b);     // Appears to return __nedf2
                                        //     (a != b)
double __negdf2vfp(double a);           // Appears to return -a
float __negsf2vfp(float a);             // Appears to return -a
double __subdf3vfp(double a, double b); // Appears to return a - b
float __subsf3vfp(float a, float b);    // Appears to return a - b
float __truncdfsf2vfp(double a);        // Appears to convert from
                                        //     double to float.
int __unorddf2vfp(double a, double b);  // Appears to return __unorddf2
int __unordsf2vfp(float a, float b);    // Appears to return __unordsf2


Preconditions are listed for each function at the definition when there are any.
Any preconditions reflect the specification at
http://gcc.gnu.org/onlinedocs/gccint/Libgcc.html#Libgcc.

Assumptions are listed in "int_lib.h", and in individual files.  Where possible,
assumptions are checked at compile time.