update our bragging about diagnostics. :)

llvm-svn: 67289
2009-03-19 06:52:51 +00:00 · 2009-03-19 06:52:51 +00:00 · 2b7dbf4d0a
parent 063699af70
commit 2b7dbf4d0a
2 changed files with 177 additions and 22 deletions
--- a/clang/www/feature-diagnostics1.png
+++ b/clang/www/feature-diagnostics1.png
--- a/clang/www/features.html
+++ b/clang/www/features.html
@@ -133,33 +133,188 @@ to tap the full potential of the clang design.</p>
 <h3><a name="expressivediags">Expressive Diagnostics</a></h3>
 <!--=======================================================================-->

-<p>Clang is designed to efficiently capture range information for expressions
-and statements, which allows it to emit very useful and detailed diagnostic
-information (e.g. warnings and errors) when a problem is detected.</p>
+<p>In addition to being fast and functional, we aim to make Clang extremely user
+friendly.  For a command-line compiler, this largely comes down to making the
+diagnostics (error and warning messages) it generates as useful as possible.
+There are several ways that we do this.  This section describes the experience
+provided by the command-line compiler, contrasting Clang's output with
+GCC 4.2's output in several examples.
+<!--
+Other clients
+that embed Clang and extract equivalent information through internal APIs.-->
+</p>

-<p>For example, this slide compares the diagnostics emitted by clang (top) to
-the diagnostics emitted by GCC (middle) for a simple example:</p>
+<h4>Column Numbers and Caret Diagnostics</h4>

-<img class="img_slide" src="feature-diagnostics1.png" width="400" height="300"/>
+<p>First, all diagnostics produced by Clang include full column number
+information, which Clang uses to print "caret diagnostics".  Many commercial
+compilers provide this feature, but it is generally missing from open source
+compilers.  Caret diagnostics make it very easy to understand exactly what is
+wrong in a particular piece of code.  For example:</p>

-<p>As you can see, clang goes beyond tracking just column number information: it
-is able to highlight the subexpressions involved in a problem, making it much
-easier to understand the source of the problem in many cases.  For example, in
-the first problem, it tells you <em>why</em> the operand is invalid (it
-requires a pointer) and what type it really is.</p>
+<pre>
+  $ <b>gcc-4.2 -fsyntax-only -Wformat format-strings.c</b>
+  format-strings.c:91: warning: too few arguments for format
+  $ <b>clang -fsyntax-only format-strings.c</b>
+  format-strings.c:91:13: warning: '.*' specified field precision is missing a matching 'int' argument
+  <font color="darkgreen">  printf("%.*d");</font>
+  <font color="blue">            ^</font>
+</pre>

-<p>In the second error, you can see how clang uses column number information to
-identify exactly which "+" out of the four on that line is causing the problem.
-Further, it highlights the subexpressions involved, which can be very useful
-when a complex subexpression that relies on tricky precedence rules.</p>
+<p>The caret (the blue "^" character) shows exactly where the problem is, even
+inside of the string.  This makes it really easy to jump to the problem and
+helps when multiple instances of the same character occur on a line.  We'll
+revisit this in the following examples.</p>

-<p>The example doesn't show it, but clang works very hard to retain typedef
-information, ensuring that diagnostics print the user types, not the fully
-expanded (and often huge) types.  This is clearly important for C++ code (tell
-me about "<tt>std::string</tt>", not about "<tt>std::basic_string&lt;char, 
-std::char_traits&lt;char&gt;, std::allocator&lt;char&gt; &gt;</tt>"!), but it is
-also very useful in C code in some cases as well (e.g. "<tt>__m128"</tt> vs
-"<tt>float __attribute__((__vector_size__(16)))</tt>").</p>
+<h4>Range Highlighting for Related Text</h4>
+
+<p>Clang captures and accurately tracks range information for expressions,
+statements, and other constructs in your program and uses this to make
+diagnostics highlight related information.  Here is a somewhat nonsensical
+example that illustrates this:</p>
+
+<pre>
+  $ <b>gcc-4.2 -fsyntax-only t.c</b>
+  t.c:7: error: invalid operands to binary + (have 'int' and 'struct A')
+  $ <b>clang -fsyntax-only t.c</b>
+  t.c:7:39: error: invalid operands to binary expression ('int' and 'struct A')
+  <font color="darkgreen">  return y + func(y ? ((SomeA.X + 40) + SomeA) / 42 + SomeA.X : SomeA.X);</font>
+  <font color="blue">                       ~~~~~~~~~~~~~~ ^ ~~~~~</font>
+</pre>
+
+<p>Here you don't even need to see the original source code to understand what
+is wrong from the Clang error.  Because Clang prints a caret, you know exactly
+<em>which</em> plus it is complaining about.  The range information highlights
+the left and right sides of that plus, which makes it immediately obvious what
+the compiler is talking about.  This is very useful for precedence issues and
+in many other cases.</p>
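
The marker line under the source text can be sketched in a few lines of C. This is only an illustration of the idea, not Clang's implementation: the `struct range` type and the `render_markers` function are hypothetical names, and columns are assumed to be 0-based offsets into the source line.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical type for this sketch: a half-open column range [begin, end),
   using 0-based offsets into the source line. */
struct range { size_t begin, end; };

/* Fill `out` with a marker line for a source line of `line_len` characters:
   '~' under every highlighted range and '^' at the caret column. */
void render_markers(char *out, size_t out_size, size_t line_len,
                    size_t caret, const struct range *ranges, size_t nranges)
{
    assert(out_size > line_len);   /* room for the markers and the terminator */
    assert(caret < line_len);      /* the caret must point into the line */

    memset(out, ' ', line_len);
    for (size_t i = 0; i < nranges; i++)
        for (size_t c = ranges[i].begin; c < ranges[i].end && c < line_len; c++)
            out[c] = '~';
    out[caret] = '^';              /* the caret wins over any range marker */

    /* Drop trailing spaces so the marker line ends at the last marker. */
    size_t end = line_len;
    while (end > 0 && out[end - 1] == ' ')
        end--;
    out[end] = '\0';
}
```

For a source line such as `a + b`, highlighting the two operands and placing the caret on the operator gives the marker line `~ ^ ~`, which a driver would print directly under the source line.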
+
+<h4>Precision in Wording</h4>
+
+<p>We have also tried hard to make the diagnostics that come out of Clang
+contain exactly the pertinent information about what is wrong and why.  In the
+example above, we tell you what the inferred types are for the left and right
+hand sides, and we don't repeat what is obvious from the caret (that this is a
+"binary +").  Many other examples abound; here is a simple one:</p>
+
+<pre>
+  $ <b>gcc-4.2 -fsyntax-only t.c</b>
+  t.c:5: error: invalid type argument of 'unary *'
+  $ <b>clang -fsyntax-only t.c</b>
+  t.c:5:11: error: indirection requires pointer operand ('int' invalid)
+  <font color="darkgreen">  int y = *SomeA.X;</font>
+  <font color="blue">          ^~~~~~~~</font>
+</pre>
+
+<p>In this example, not only do we tell you that there is a problem with the *
+and point to it, we say exactly why and tell you what the type is (in case it is
+a complicated subexpression, such as a call to an overloaded function).  This
+sort of attention to detail makes it much easier to understand and fix problems
+quickly.</p>
+
+<h4>No Pretty Printing of Expressions in Diagnostics</h4>
+
+<p>Since Clang has range highlighting, it never needs to pretty print your code
+back out to you.  Pretty printing is particularly bad in G++ (which often emits
+errors containing lowered vtable references), but even GCC can produce
+inscrutable error messages in some cases when it tries to do this.  In this
+example P and Q have type "int*":</p>
+
+<pre>
+  $ <b>gcc-4.2 -fsyntax-only t.c</b>
+  #'exact_div_expr' not supported by pp_c_expression#'t.c:12: error: called object  is not a function
+  $ <b>clang -fsyntax-only t.c</b>
+  t.c:12:8: error: called object type 'int' is not a function or function pointer
+  <font color="darkgreen">  (P-Q)();</font>
+  <font color="blue">  ~~~~~^</font>
+</pre>
+
+
+<h4>Typedef Preservation and Selective Unwrapping</h4>
+
+<p>Many programmers use high-level user defined types, typedefs, and other
+syntactic sugar to refer to types in their program.  This is useful because
+typedefs can abbreviate otherwise very long types, and it is useful to preserve
+the typedef name in diagnostics.  However, sometimes very simple typedefs wrap
+trivial types, and it is important to strip off the typedef to understand what
+is going on.  Clang aims to handle both cases well.</p>
+
+<p>Here is an example that shows where it is important to preserve a typedef
+in C:</p>
+
+<pre>
+  $ <b>gcc-4.2 -fsyntax-only t.c</b>
+  t.c:15: error: invalid operands to binary / (have 'float __vector__' and 'const int *')
+  $ <b>clang -fsyntax-only t.c</b>
+  t.c:15:11: error: can't convert between vector values of different size ('__m128' and 'int const *')
+  <font color="darkgreen">  myvec[1]/P;</font>
+  <font color="blue">  ~~~~~~~~^~</font>
+</pre>
+
+<p>Here the type printed by GCC isn't even valid, but if the error were about a
+very long and complicated type (as often happens in C++) the error message would
+be ugly just because it was long and hard to read.  Here's an example where it
+is useful for the compiler to expose underlying details of a typedef:</p>
+
+<pre>
+  $ <b>gcc-4.2 -fsyntax-only t.c</b>
+  t.c:13: error: request for member 'x' in something not a structure or union
+  $ <b>clang -fsyntax-only t.c</b>
+  t.c:13:9: error: member reference base type 'pid_t' (aka 'int') is not a structure or union
+  <font color="darkgreen">  myvar = myvar.x;</font>
+  <font color="blue">          ~~~~~ ^</font>
+</pre>
+
+<p>If the user was somehow confused about how the system "pid_t" typedef is
+defined, Clang helpfully displays it with "aka".</p>
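
The "aka" form can be sketched as a small formatting helper. This is a simplified illustration, not Clang's code: `format_type` is a hypothetical name, and the rule used here (print the underlying type whenever it differs from the written one) is an assumption. The real policy is more selective, as the `__m128` example earlier shows, where no underlying type is printed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format a type name for a diagnostic.
   `written` is the type as the user spelled it (for example a typedef name);
   `underlying` is the type it resolves to.  When the two differ, the
   underlying type is appended in an "aka" clause.  Returns the value of
   snprintf (the length the full string needs). */
int format_type(char *out, size_t out_size,
                const char *written, const char *underlying)
{
    assert(out_size > 0);
    if (strcmp(written, underlying) == 0)
        return snprintf(out, out_size, "'%s'", written);
    return snprintf(out, out_size, "'%s' (aka '%s')", written, underlying);
}
```

Called with `"pid_t"` and `"int"`, this produces `'pid_t' (aka 'int')`, the same shape as the type in the Clang message above.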
+
+<h4>Automatic Macro Expansion</h4>
+
+<p>Many errors happen in macros that are sometimes deeply nested.  With
+traditional compilers, you need to dig deep into the definition of the macro to
+understand how you got into trouble.  Here's a simple example that shows how
+Clang helps you out:</p>
+
+<pre>
+  $ <b>gcc-4.2 -fsyntax-only t.c</b>
+  t.c: In function 'test':
+  t.c:80: error: invalid operands to binary &lt; (have 'struct mystruct' and 'float')
+  $ <b>clang -fsyntax-only t.c</b>
+  t.c:80:3: error: invalid operands to binary expression ('typeof(P)' (aka 'struct mystruct') and 'typeof(F)' (aka 'float'))
+  <font color="darkgreen">  X = MYMAX(P, F);</font>
+  <font color="blue">      ^~~~~~~~~~~</font>
+  t.c:76:94: note: instantiated from:
+  <font color="darkgreen">#define MYMAX(A,B)    __extension__ ({ __typeof__(A) __a = (A); __typeof__(B) __b = (B); __a &lt; __b ? __b : __a; })</font>
+  <font color="blue">                                                                                         ~~~ ^ ~~~</font>
+</pre>
+
+<p>This shows how Clang automatically prints instantiation information and
+nested range information for diagnostics as they are instantiated through
+macros, and also shows how some of the other pieces work in a bigger example.
+Here's another real-world warning that occurs in the "window" Unix package
+(which implements the "wwopen" class of APIs):</p>
+
+<pre>
+  $ <b>clang -fsyntax-only t.c</b>
+  t.c:22:2: warning: type specifier missing, defaults to 'int'
+  <font color="darkgreen">        ILPAD();</font>
+  <font color="blue">        ^</font>
+  t.c:17:17: note: instantiated from:
+  <font color="darkgreen">#define ILPAD() PAD((NROW - tt.tt_row) * 10)    /* 1 ms per char */</font>
+  <font color="blue">                ^</font>
+  t.c:14:2: note: instantiated from:
+  <font color="darkgreen">        register i; \</font>
+  <font color="blue">        ^</font>
+</pre>
+
+<p>In practice, we've found that this is actually more useful in multiply nested
+macros than in simple ones.</p>
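
To see why the MYMAX error is reported inside the macro body, it helps to look at a use that does compile. The sketch below reuses the `MYMAX` definition from the diagnostic above; it relies on GNU extensions (statement expressions and `__typeof__`) that both GCC and Clang accept. The `max_mixed` function is a hypothetical name added for illustration.

```c
#include <assert.h>

/* Same definition as in the diagnostic above (GNU statement expression). */
#define MYMAX(A,B)    __extension__ ({ __typeof__(A) __a = (A); __typeof__(B) __b = (B); __a < __b ? __b : __a; })

/* Hypothetical example: both operands are arithmetic types, so the
   `__a < __b` comparison inside the expansion is valid. */
float max_mixed(void)
{
    int i = 1;
    float f = 2.5f;
    return MYMAX(i, f);   /* the int is converted to float for the comparison */
}

/* Passing a struct value as one argument would make `__a < __b` invalid.
   The bad comparison exists only in the expansion, which is why the note
   in the diagnostic points into the macro definition. */
```

With these operands `max_mixed()` returns 2.5, and `MYMAX(3, 7)` evaluates to 7.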
+
+
+<h4>C++ Fun Examples</h4>
+
+<p>...</p>

 <!--=======================================================================-->
 <h3><a name="gcccompat">GCC Compatibility</a></h3>