This:
define float @foo(float %x, float %y) nounwind readnone {
entry:
%0 = tail call float @copysignf(float %x, float %y) nounwind readnone
ret float %0
}
was compiled to:
vmov s0, r1
bic r0, r0, #-2147483648
vmov s1, r0
vcmpe.f32 s0, #0
vmrs apsr_nzcv, fpscr
it lt
vneglt.f32 s1, s1
vmov r0, s1
bx lr
This fails to copy the sign of -0.0f: vcmpe.f32 compares -0.0f as equal to
zero, so the "lt" condition is false and the negation never happens. It is
also sub-optimal when the inputs are in GPR registers.
Now it uses integer "and" + "or" operations when that is profitable, and it is
correct:
lsrs r1, r1, #31
bfi r0, r1, #31, #1
bx lr
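For illustration only, the two strategies can be sketched in C++. This is not
the code from the ARM backend; the function names are hypothetical and the
sketch assumes a 32-bit IEEE-754 float. The comparison-based variant mirrors
the shape of the old sequence (clear the sign, then negate if the sign operand
is less than zero), and the bitwise variant mirrors the new one.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

// Hypothetical helper: copy the sign bit with integer and/or on the raw bits.
float copysign_bits(float magnitude, float sign) {
    std::uint32_t m, s;
    std::memcpy(&m, &magnitude, sizeof m);  // reinterpret bits, no conversion
    std::memcpy(&s, &sign, sizeof s);
    const std::uint32_t kSignBit = 0x80000000u;
    const std::uint32_t r = (m & ~kSignBit) | (s & kSignBit);
    float out;
    std::memcpy(&out, &r, sizeof out);
    return out;
}

// Hypothetical helper: clear the sign, then negate when sign < 0.
// -0.0f is not less than zero, so its sign is never copied.
float copysign_compare(float magnitude, float sign) {
    const float r = std::fabs(magnitude);
    return (sign < 0.0f) ? -r : r;
}

// Small self-check showing where the two variants differ.
inline void copysign_demo() {
    assert(std::signbit(copysign_bits(1.0f, -0.0f)));      // sign copied
    assert(!std::signbit(copysign_compare(1.0f, -0.0f)));  // sign dropped
}
```

The bitwise form never inspects the value of the sign operand, only its top
bit, which is why it handles -0.0f the same way as any other negative input.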
rdar://8984306
llvm-svn: 125357
causing the deserialization of a large number of declarations when
writing the visible-updates record for the translation unit in C. This
takes us from:
*** AST File Statistics:
2 stat cache hits
6 stat cache misses
1/64463 source location entries read (0.001551%)
15606/16956 types read (92.038216%)
59266/89334 declarations read (66.342041%)
38952/61393 identifiers read (63.446976%)
0/7778 selectors read (0.000000%)
24192/34644 statements read (69.830276%)
388/8809 macros read (4.404586%)
2095/5189 lexical declcontexts read (40.373867%)
0/4587 visible declcontexts read (0.000000%)
0/7716 method pool entries read (0.000000%)
0 method pool misses
to
*** AST File Statistics:
2 stat cache hits
6 stat cache misses
1/64463 source location entries read (0.001551%)
26/16956 types read (0.153338%)
18/89334 declarations read (0.020149%)
145/61393 identifiers read (0.236183%)
0/7778 selectors read (0.000000%)
21/34644 statements read (0.060617%)
0/8809 macros read (0.000000%)
0/5189 lexical declcontexts read (0.000000%)
0/4587 visible declcontexts read (0.000000%)
0/7716 method pool entries read (0.000000%)
0 method pool misses
when generating a chained PCH for a header that #includes Cocoa.h
(from a PCH file) and adds one simple function declaration. The
generated PCH file is now only 9580 bytes (down from > 2MB).
llvm-svn: 125326
we would deserialize all of the macro definitions we knew about while
serializing the macro definitions at the end of the AST/PCH file. Even
though we skipped most of them (since they were unchanged), it's still
a performance problem.
Now, we do the standard AST/PCH chaining trick: watch what identifiers
are deserialized as macro names, and consider only those identifiers
(along with macro definitions that have been deserialized/written in
the source) when serializing the preprocessor state.
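A minimal sketch of that bookkeeping, for illustration only. The class and
method names below are hypothetical and are not Clang's actual reader/writer
interfaces; the sketch only shows the idea of recording candidates as they are
seen instead of walking every known macro at write time.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the writer-side bookkeeping.
class MacroWriterSketch {
public:
    // Called when an identifier is deserialized as a macro name.
    void noteDeserializedMacroName(const std::string& name) {
        candidates_.insert(name);
    }

    // Called when a macro definition is written in the current source.
    void noteMacroDefinedInSource(const std::string& name) {
        candidates_.insert(name);
    }

    // Only the recorded names are considered when serializing; macros that
    // were never touched are not deserialized just to be skipped.
    std::vector<std::string> macrosToSerialize() const {
        return std::vector<std::string>(candidates_.begin(), candidates_.end());
    }

private:
    std::set<std::string> candidates_;  // sorted, duplicates collapsed
};

// Small usage example.
inline void macro_writer_demo() {
    MacroWriterSketch w;
    w.noteDeserializedMacroName("FOO");
    w.noteMacroDefinedInSource("BAR");
    w.noteDeserializedMacroName("FOO");  // seen twice, recorded once
    assert(w.macrosToSerialize().size() == 2);
}
```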
llvm-svn: 125324
- Objective-C constant strings were being NULL-terminated erroneously.
- Empty Objective-C constant strings were not being generated correctly.
Also added the template for a test of these fixes.
llvm-svn: 125314
AST/PCH files more lazy:
- Don't preload all of the file source-location entries when reading
the AST file. Instead, load them lazily, when needed.
- Only look up header-search information (whether a header was already
#import'd, how many times it's been included, etc.) when it's needed
by the preprocessor, rather than pre-populating it.
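The load-on-first-use pattern described in the two points above can be
sketched as follows. This is an illustration under stated assumptions, not the
AST reader's real data structures: FileEntrySketch, LazySLocTable, and the
loader callback are all hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a file source-location entry.
struct FileEntrySketch {
    std::string path;
};

// Hypothetical table that materializes an entry only when it is requested.
class LazySLocTable {
public:
    using Loader = std::function<FileEntrySketch(std::size_t)>;

    LazySLocTable(std::size_t count, Loader loader)
        : slots_(count), loader_(std::move(loader)) {}

    // Loads the entry on first access and caches it for later lookups.
    const FileEntrySketch& get(std::size_t index) {
        std::unique_ptr<FileEntrySketch>& slot = slots_.at(index);
        if (!slot) {
            slot.reset(new FileEntrySketch(loader_(index)));
            ++loaded_;
        }
        return *slot;
    }

    std::size_t size() const { return slots_.size(); }
    std::size_t loadedCount() const { return loaded_; }

private:
    std::vector<std::unique_ptr<FileEntrySketch>> slots_;  // null until used
    Loader loader_;
    std::size_t loaded_ = 0;
};

// Small usage example: nothing is loaded until an entry is asked for.
inline void lazy_sloc_demo() {
    LazySLocTable table(860, [](std::size_t i) {
        return FileEntrySketch{"header" + std::to_string(i) + ".h"};
    });
    assert(table.loadedCount() == 0);
    table.get(5);
    assert(table.loadedCount() == 1);
}
```

In this sketch the expensive work (whatever the loader does, such as touching
the file system) is paid once per entry that is actually used, rather than
once per entry stored in the file.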
Previously, we would pre-load all of the file source-location entries,
which also populated the header-search information structure. This was
a relatively minor performance issue, since we would end up stat()'ing
all of the headers stored within an AST/PCH file when the AST/PCH file
was loaded. In the normal PCH use case, the stat()s were cached, so
the cost of preloading ~860 source-location entries (in the Cocoa.h
case) was relatively low.
However, the recent optimization that replaced stat+open with
open+fstat turned this into a major problem, since the preloading of
source-location entries would now end up opening those files. Worse,
those files wouldn't be closed until the file manager was destroyed,
so just opening a Cocoa.h PCH file would hold on to ~860 file
descriptors, and it was easy to blow through the process's limit on
the number of open file descriptors.
By eliminating the preloading of these files, we neither open nor stat
the headers stored in the PCH/AST file until they're actually needed
for something. Concretely, we went from
*** HeaderSearch Stats:
835 files tracked.
364 #import/#pragma once files.
823 included exactly once.
6 max times a file is included.
3 #include/#include_next/#import.
0 #includes skipped due to the multi-include optimization.
1 framework lookups.
0 subframework lookups.
*** Source Manager Stats:
835 files mapped, 3 mem buffers mapped.
37460 SLocEntry's allocated, 11215575B of Sloc address space used.
62 bytes of files mapped, 0 files with line #'s computed.
with a trivial program that uses a chained PCH including a Cocoa PCH
to
*** HeaderSearch Stats:
4 files tracked.
1 #import/#pragma once files.
3 included exactly once.
2 max times a file is included.
3 #include/#include_next/#import.
0 #includes skipped due to the multi-include optimization.
1 framework lookups.
0 subframework lookups.
*** Source Manager Stats:
3 files mapped, 3 mem buffers mapped.
37460 SLocEntry's allocated, 11215575B of Sloc address space used.
62 bytes of files mapped, 0 files with line #'s computed.
for the same program.
llvm-svn: 125286