Unify terms regarding assembly form to use generic vs. custom

This CL changes various docs and comments to use the terms "generic" and "custom" when mentioning assembly forms. To be consistent, several methods are also renamed:

* FunctionParser::parseVerboseOperation() -> parseGenericOperation()
* ModuleState::hasShorthandForm() -> hasCustomForm()
* OpAsmPrinter::printDefaultOp() -> printGenericOp()

PiperOrigin-RevId: 230568819
2019-01-23 11:26:56 -08:00 · 5654450853
parent b28009b681
commit 5654450853
11 changed files with 100 additions and 77 deletions
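
For readers unfamiliar with the two forms, the following is a minimal, self-contained C++ sketch (not MLIR code) of the fallback that `impl::printBinaryOp` applies in the diff below: print the concise custom assembly form only when both operand types equal the result type, otherwise fall back to the generic assembly form so no type information is dropped. The names `BinaryOpSketch`, `printGenericForm`, and `printBinaryOpSketch`, and the fixed `%lhs`/`%rhs` operand names, are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a binary operation; this is not an MLIR class.
struct BinaryOpSketch {
  std::string name;       // operation name, e.g. "addi"
  std::string lhsType;    // type of the first operand
  std::string rhsType;    // type of the second operand
  std::string resultType; // type of the single result
};

// Generic assembly form: quoted name, operand list, and a full function type.
std::string printGenericForm(const BinaryOpSketch &op) {
  return "\"" + op.name + "\"(%lhs, %rhs) : (" + op.lhsType + ", " +
         op.rhsType + ") -> " + op.resultType;
}

// Custom assembly form when all types agree, generic form otherwise.
std::string printBinaryOpSketch(const BinaryOpSketch &op) {
  if (op.lhsType != op.resultType || op.rhsType != op.resultType)
    return printGenericForm(op);
  return op.name + " %lhs, %rhs : " + op.resultType;
}
```

The generic branch spells out every type, which is why it serves as the safe fallback; the custom branch can drop the repeated types only because the check has established that they are identical.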
--- a/mlir/g3doc/LangRef.md
+++ b/mlir/g3doc/LangRef.md
@@ -1029,9 +1029,9 @@ and the bound is the maximum/minimum of the returned values. There is no
 semantic ambiguity, but MLIR syntax requires the use of these keywords to make
 things more obvious to human readers.

-Many upper and lower bounds are simple, so MLIR accepts two shorthand syntaxes:
-the form that accepts a single 'ssa-id' (e.g. `%N`) is shorthand for applying
-that SSA value to a function that maps a single symbol to itself, e.g.,
+Many upper and lower bounds are simple, so MLIR accepts two custom form
+syntaxes: the form that accepts a single 'ssa-id' (e.g. `%N`) is shorthand for
+applying that SSA value to a function that maps a single symbol to itself, e.g.,
 `()[s]->(s)()[%N]`. The integer literal form (e.g. `-42`) is shorthand for a
 nullary mapping function that returns the constant value (e.g. `()->(-42)()`).

@@ -1118,8 +1118,8 @@ The internal representation of an operation is simple: an operation is
 identified by a unique string (e.g. `dim`, `tf.Conv2d`, `x86.repmovsb`,
 `ppc.eieio`, etc), can return zero or more results, take zero or more SSA
 operands, and may have zero or more attributes. When parsed or printed in the
-raw form, these are all printed literally, and a function type is used to
-indicate the types of the results and operands.
+_generic assembly form_, these are all printed literally, and a function type is
+used to indicate the types of the results and operands.

 Example:

@@ -1145,9 +1145,10 @@ Example:
 "br_cond"(%cond)[^bb1, ^bb2(%v : index)] : (i1) -> ()
 ```

-In addition to the basic syntax above, applications may register tables of known
-operations. This allows those applications to support custom syntax for parsing
-and printing operations. In the operation sets listed below, we show both forms.
+In addition to the basic syntax above, dialects may register tables of known
+operations. This allows those dialects to support _custom assembly form_ for
+parsing and printing operations. In the operation sets listed below, we show
+both forms.

 **Context:** TensorFlow has an open "op" ecosystem, and we directly apply these
 ideas to the design of MLIR, but generalize it much further. To make it easy to
@@ -1326,7 +1327,7 @@ Examples:
 // Returns the dynamic dimension of %A.
 %y = dim %A, 1 : tensor<4 x ? x f32>

-// Equivalent longhand form:
+// Equivalent generic form:
 %x = "dim"(%A){index: 0} : (tensor<4 x ? x f32>) -> index
 %y = "dim"(%A){index: 1} : (tensor<4 x ? x f32>) -> index
 ```
@@ -1732,16 +1733,16 @@ controls.
 Examples:

 ```mlir {.mlir}
-// Scalar "signed less than" comparison.
+// Custom form of scalar "signed less than" comparison.
 %x = cmpi "slt", %lhs, %rhs : i32

-// Long-hand notation of the same operation.
+// Generic form of the same operation.
 %x = "cmpi"(%lhs, %rhs){predicate: 2} : (i32, i32) -> i1

-// Vector equality comparison.
+// Custom form of vector equality comparison.
 %x = cmpi "eq", %lhs, %rhs : vector<4xi64>

-// Long-hand notation of the same operation.
+// Generic form of the same operation.
 %x = "cmpi"(%lhs, %rhs){predicate: 0}
    : (vector<4xi64>, vector<4xi64> -> vector<4xi1>
 ```
@@ -1770,10 +1771,10 @@ tensor operands, the comparison is performed elementwise and the element of the
 result indicates whether the comparison is true for the operand elements with
 the same indices as those of the result.

-Note: while the short-hand notation uses strings, the actual underlying
+Note: while the custom assembly form uses strings, the actual underlying
 attribute has integer type (or rather enum class in C++ code) as seen from the
-long-hand notation. String literals are used to improve readability of the IR by
-humans.
+generic assembly form. String literals are used to improve readability of the IR
+by humans.

 This operation only applies to integer-like operands, but not floats. The main
 reason being that comparison operations have diverging sets of attributes:
@@ -1813,7 +1814,7 @@ Examples:
 // Reference to function @myfn.
 %3 = constant @myfn : (tensor<16xf32>, f32) -> tensor<16xf32>

-// Equivalent longhand forms
+// Equivalent generic forms
 %1 = "constant"(){value: 42} : i32
 %3 = "constant"(){value: @myfn}
   : () -> (tensor<16xf32>, f32) -> tensor<16xf32>
@@ -2014,10 +2015,10 @@ operation ::= ssa-id `=` `select` ssa-use, ssa-use, ssa-use `:` type
 Examples:

 ```mlir {.mlir}
-// Short-hand notation of scalar selection.
+// Custom form of scalar selection.
 %x = select %cond, %true, %false : i32

-// Long-hand notation of the same operation.
+// Generic form of the same operation.
 %x = "select"(%cond, %true, %false) : (i1, i32, i32) -> i32

 // Vector selection is element-wise
--- a/mlir/g3doc/OpDefinitions.md
+++ b/mlir/g3doc/OpDefinitions.md
@@ -65,10 +65,10 @@ requirements that were desirable:
 *   Behavior of the op is documented along with the op with a summary and a
    description. The description is written in markdown and extracted for
    inclusion in the generated LangRef section of the dialect.
-*   The verbose form of printing and parsing is available as normal, but a
-    custom parser and printer can either be specified or automatically generated
-    from an optional string representation showing the mapping of the "assembly"
-    string to operands/type.
+*   The generic assembly form of printing and parsing is available as normal,
+    but a custom parser and printer can either be specified or automatically
+    generated from an optional string representation showing the mapping of the
+    "assembly" string to operands/type.
    *   Parser-level remappings (e.g., `eq` to enum) will be supported as part
        of the parser generation.
 *   Matching patterns are specified separately from the op description.
@@ -189,7 +189,7 @@ Operation definitions consists of:

 1.  Custom printer method.

-    The custom printer to invoke when producing the short form output.
+    The custom printer to invoke when producing the custom assembly form output.

 1.  Custom verifier code.

@@ -211,9 +211,10 @@ just being able to specify custom printer/parser methods are sufficient. This
 should presumably be influenced by the design of the assembler/disassembler
 logic that LLVM backends get for free for machine instructions.

-The short form/custom emitter form of the operation is specified using a string
-with matching operation name, operands and attributes. With the ability to
-express additional information that needs to be parsed to build the operation:
+The custom assembly form emitter form of the operation is specified using a
+string with matching operation name, operands and attributes. With the ability
+to express additional information that needs to be parsed to build the
+operation:

 ```tablegen
 tfl.Add $lhs, $rhs {fused_activation_function:
@@ -235,9 +236,9 @@ tfl.Add $lhs, $rhs {fused_activation_function:
    E.g., attribute axis is matched with `$axis`. Custom parsing for attribute
    type can be defined along with the attribute definition.

-1.  The information in the short form should be sufficient to invoke the builder
-    generated. That may require being able to propagate information (e.g., the
-    `$lhs` has the same type as the result).
+1.  The information in the custom assembly form should be sufficient to invoke
+    the builder generated. That may require being able to propagate information
+    (e.g., the `$lhs` has the same type as the result).

 Printing is effectively the inverse of the parsing function generated with the
 mnemonic string serving as a template.
--- a/mlir/g3doc/Rationale.md
+++ b/mlir/g3doc/Rationale.md
@@ -328,11 +328,11 @@ number of "reserved" names used by standard operations as well as the size of
 the C++ API while their implementations would have been mostly identical.

 The comparison kind is internally an integer attribute. However, for the sake of
-readability by humans, short-hand notation accepts string literals that are
+readability by humans, custom assembly form accepts string literals that are
 mapped to the underlying integer values: `cmpi "eq", %lhs, %rhs` better implies
 integer equality comparison than `cmpi 0, %lhs, %rhs` where it is unclear what
 gets compared to what else. This syntactic sugar is possible thanks to parser
-logic redefinitions for short-hand notation of non-builtin operations.
+logic redefinitions for custom assembly form of non-builtin operations.
 Supporting it in the full notation would have required changing how the main
 parsing algorithm works and may have unexpected repercussions. While it had been
 possible to store the predicate as string attribute, it would have rendered
@@ -434,18 +434,18 @@ understand. When types of a dialect are:
 Following the separation between the built-in and standard dialect, it makes
 sense to separate built-in types and standard dialect types. Built-in types are
 required for the validity of the IR itself, e.g. the function type (which
-appears in function signatures and long-hand forms of operations). Integer,
-float, vector, memref and tensor types, while important, are not necessary for
-IR validity.
+appears in function signatures and generic assembly forms of operations).
+Integer, float, vector, memref and tensor types, while important, are not
+necessary for IR validity.

 #### Unregistered types {#unregistered-types}

-MLIR supports unregistered operations in verbose notation. MLIR also supports a
-similar concept for types. When parsing, if the dialect for dialect type has not
-been registered the type is modeled as an 'UnknownType'. This allows for types
-to be round-tripped without needing to link in the dialect library that defined
-them. No additional information about unknown types, outside of
-parsing/printing, will be available.
+MLIR supports unregistered operations in generic assembly form. MLIR also
+supports a similar concept for types. When parsing, if the dialect for dialect
+type has not been registered the type is modeled as an 'UnknownType'. This
+allows for types to be round-tripped without needing to link in the dialect
+library that defined them. No additional information about unknown types,
+outside of parsing/printing, will be available.

 #### Dialect type syntax

@@ -487,6 +487,26 @@ to think of these types as existing within the namespace of the dialect. If a
 dialect wishes to assign a canonical name to a type, it can be done via
 [type aliases](LangRef.md#type-aliases).

+### Assembly forms
+
+MLIR decides to support both generic and custom assembly forms under the
+following considerations:
+
+MLIR is an open system; it is designed to support modular and pluggable
+dialects. Depending on whether there exists a corresponding dialect and whether
+the dialect is plugged in, operations may or may not be registered into MLIR
+system. Yet we still need a way to investigate these operations. So the generic
+assembly form is mandated by this aspect of MLIR system. It provides a default
+textual form for operations.
+
+On the other hand, an assembly form is for assisting developers to investigate
+the IR. The generic form serves as a safe fallback but it can be too verbose for
+certain ops. Therefore, MLIR gives each dialect the choice to define a custom
+assembly form for each operation according to the operation's semantics and
+specific needs. The custom assembly form can de-duplicate information from the
+operation to derive a more concise form, thus better facilitating the
+comprehension of the IR.
+
 ## Examples {#examples}

 This section describes a few very simple examples that help understand how MLIR
--- a/mlir/include/mlir/IR/OpDefinition.h
+++ b/mlir/include/mlir/IR/OpDefinition.h
@@ -196,13 +196,13 @@ protected:
  /// back to this one which accepts everything.
  bool verify() const { return false; }

-  /// Unless overridden, the short form of an op is always rejected.  Op
-  /// implementations should implement this to return boolean true on failure.
+  /// Unless overridden, the custom assembly form of an op is always rejected.
+  /// Op implementations should implement this to return true on failure.
  /// On success, they should return false and fill in result with the fields to
  /// use.
  static bool parse(OpAsmParser *parser, OperationState *result);

-  // The fallback for the printer is to print it the longhand form.
+  // The fallback for the printer is to print it the generic assembly form.
  void print(OpAsmPrinter *p) const;

  /// Mutability management is handled by the OpWrapper/OpConstWrapper classes,
@@ -895,8 +895,9 @@ namespace impl {
 void buildBinaryOp(Builder *builder, OperationState *result, Value *lhs,
                   Value *rhs);
 bool parseBinaryOp(OpAsmParser *parser, OperationState *result);
-// Prints the given binary `op` in short-hand notion if both the two operands
-// and the result have the same time. Otherwise, prints the long-hand notion.
+// Prints the given binary `op` in custom assembly form if both the two operands
+// and the result have the same time. Otherwise, prints the generic assembly
+// form.
 void printBinaryOp(const OperationInst *op, OpAsmPrinter *p);
 } // namespace impl

--- a/mlir/include/mlir/IR/OpImplementation.h
+++ b/mlir/include/mlir/IR/OpImplementation.h
@@ -86,8 +86,8 @@ public:
  printOptionalAttrDict(ArrayRef<NamedAttribute> attrs,
                        ArrayRef<const char *> elidedAttrs = {}) = 0;

-  /// Print the entire operation with the default verbose formatting.
-  virtual void printDefaultOp(const OperationInst *op) = 0;
+  /// Print the entire operation with the default generic assembly form.
+  virtual void printGenericOp(const OperationInst *op) = 0;

 private:
  OpAsmPrinter(const OpAsmPrinter &) = delete;
--- a/mlir/include/mlir/LLVMIR/llvm_ops.td
+++ b/mlir/include/mlir/LLVMIR/llvm_ops.td
@@ -44,8 +44,8 @@ class LLVM_Op<string mnemonic, list<OpProperty> props = [],
  let parser =
      [{ llvm_unreachable("custom parsing triggered instead of default"); }];

-  // Just use the verbose form.
-  let printer = [{ p->printDefaultOp(this->getInstruction()); }];
+  // Just use the generic assembly form.
+  let printer = [{ p->printGenericOp(this->getInstruction()); }];
 }

 // Base class for LLVM operations with one result.
--- a/mlir/include/mlir/StandardOps/StandardOps.h
+++ b/mlir/include/mlir/StandardOps/StandardOps.h
@@ -170,9 +170,9 @@ enum class CmpIPredicate {
 /// Since integers are signless, the predicate also explicitly indicates
 /// whether to interpret the operands as signed or unsigned integers for
 /// less/greater than comparisons.  For the sake of readability by humans,
-/// short-hand syntax for the instruction uses a string-typed attribute for the
-/// predicate.  The value of this attribute corresponds to lower-cased name of
-/// the predicate constant, e.g., "slt" means "signed less than".  The string
+/// custom assembly form for the instruction uses a string-typed attribute for
+/// the predicate.  The value of this attribute corresponds to lower-cased name
+/// of the predicate constant, e.g., "slt" means "signed less than".  The string
 /// representation of the attribute is merely a syntactic sugar and is converted
 /// to an integer attribute by the parser.
 ///
--- a/mlir/include/mlir/StandardOps/standard_ops.td
+++ b/mlir/include/mlir/StandardOps/standard_ops.td
@@ -61,7 +61,7 @@ class ArithmeticOp<string mnemonic, list<OpProperty> props = [],
 // tensors thereof.  This operation takes two operands and returns one result,
 // each of these is required to be of the same type.  This type may be an
 // integer scalar type, a vector whose element type is an integer type, or an
-// integer tensor.  The short-hand syntax of the operaton is as follows
+// integer tensor.  The custom assembly form of the operaton is as follows
 //
 //     <op>i %0, %1 : i32
 class IntArithmeticOp<string mnemonic, list<OpProperty> props = [],
@@ -73,8 +73,8 @@ class IntArithmeticOp<string mnemonic, list<OpProperty> props = [],
 // tensors thereof.  This operation has two operands and returns one result,
 // each of these is required to be of the same type.  This type may be a
 // floating point scalar type, a vector whose element type is a floating point
-// type, or a floating point tensor.  The short-hand syntax of the operation is
-// as follows
+// type, or a floating point tensor.  The custom assembly form of the operation
+// is as follows
 //
 //     <op>f %0, %1 : f32
 class FloatArithmeticOp<string mnemonic, list<OpProperty> props = [],
--- a/mlir/lib/IR/AsmPrinter.cpp
+++ b/mlir/lib/IR/AsmPrinter.cpp
@@ -117,8 +117,8 @@ private:

  void recordTypeReference(Type ty) { usedTypes.insert(ty); }

-  // Return true if this map could be printed using the shorthand form.
-  static bool hasShorthandForm(AffineMap boundMap) {
+  // Return true if this map could be printed using the custom assembly form.
+  static bool hasCustomForm(AffineMap boundMap) {
    if (boundMap.isSingleConstant())
      return true;

@@ -190,11 +190,11 @@ void ModuleState::visitIfInst(const IfInst *ifInst) {

 void ModuleState::visitForInst(const ForInst *forInst) {
  AffineMap lbMap = forInst->getLowerBoundMap();
-  if (!hasShorthandForm(lbMap))
+  if (!hasCustomForm(lbMap))
    recordAffineMapReference(lbMap);

  AffineMap ubMap = forInst->getUpperBoundMap();
-  if (!hasShorthandForm(ubMap))
+  if (!hasCustomForm(ubMap))
    recordAffineMapReference(ubMap);
 }

@@ -985,7 +985,7 @@ public:
  void print(const Block *block);

  void printOperation(const OperationInst *op);
-  void printDefaultOp(const OperationInst *op);
+  void printGenericOp(const OperationInst *op);

  // Implement OpAsmPrinter.
  raw_ostream &getStream() const { return os; }
@@ -1415,11 +1415,11 @@ void FunctionPrinter::printOperation(const OperationInst *op) {
    return;
  }

-  // Otherwise use the standard verbose printing approach.
-  printDefaultOp(op);
+  // Otherwise print with the generic assembly form.
+  printGenericOp(op);
 }

-void FunctionPrinter::printDefaultOp(const OperationInst *op) {
+void FunctionPrinter::printGenericOp(const OperationInst *op) {
  os << '"';
  printEscapedString(op->getName().getStringRef(), os);
  os << "\"(";
@@ -1507,11 +1507,11 @@ void FunctionPrinter::printDimAndSymbolList(ArrayRef<InstOperand> ops,
 void FunctionPrinter::printBound(AffineBound bound, const char *prefix) {
  AffineMap map = bound.getMap();

-  // Check if this bound should be printed using short-hand notation.
-  // The decision to restrict printing short-hand notation to trivial cases
+  // Check if this bound should be printed using custom assembly form.
+  // The decision to restrict printing custom assembly form to trivial cases
  // comes from the will to roundtrip MLIR binary -> text -> binary in a
  // lossless way.
-  // Therefore, short-hand parsing and printing is only supported for
+  // Therefore, custom assembly form parsing and printing is only supported for
  // zero-operand constant maps and single symbol operand identity maps.
  if (map.getNumResults() == 1) {
    AffineExpr expr = map.getResult(0);
--- a/mlir/lib/IR/Operation.cpp
+++ b/mlir/lib/IR/Operation.cpp
@@ -55,14 +55,14 @@ OpAsmParser::~OpAsmParser() {}
 // OpState trait class.
 //===----------------------------------------------------------------------===//

-// The fallback for the parser is to reject the short form.
+// The fallback for the parser is to reject the custom assembly form.
 bool OpState::parse(OpAsmParser *parser, OperationState *result) {
-  return parser->emitError(parser->getNameLoc(), "has no concise form");
+  return parser->emitError(parser->getNameLoc(), "has no custom assembly form");
 }

-// The fallback for the printer is to print it the longhand form.
+// The fallback for the printer is to print in the generic assembly form.
 void OpState::print(OpAsmPrinter *p) const {
-  p->printDefaultOp(getInstruction());
+  p->printGenericOp(getInstruction());
 }

 /// Emit an error about fatal conditions with this operation, reporting up to
@@ -349,11 +349,11 @@ void impl::printBinaryOp(const OperationInst *op, OpAsmPrinter *p) {
  assert(op->getNumResults() == 1 && "binary op should have one result");

  // If not all the operand and result types are the same, just use the
-  // canonical form to avoid omitting information in printing.
+  // generic assembly form to avoid omitting information in printing.
  auto resultType = op->getResult(0)->getType();
  if (op->getOperand(0)->getType() != resultType ||
      op->getOperand(1)->getType() != resultType) {
-    p->printDefaultOp(op);
+    p->printGenericOp(op);
    return;
  }

--- a/mlir/lib/Parser/Parser.cpp
+++ b/mlir/lib/Parser/Parser.cpp
@@ -1993,7 +1993,7 @@ public:

  // Operations
  ParseResult parseOperation();
-  OperationInst *parseVerboseOperation();
+  OperationInst *parseGenericOperation();
  OperationInst *parseCustomOperation();

  ParseResult parseForInst();
@@ -2505,7 +2505,7 @@ ParseResult FunctionParser::parseOperation() {
  if (getToken().is(Token::bare_identifier) || getToken().isKeyword())
    op = parseCustomOperation();
  else if (getToken().is(Token::string))
-    op = parseVerboseOperation();
+    op = parseGenericOperation();
  else
    return emitError("expected operation name in quotes");

@@ -2537,7 +2537,7 @@ ParseResult FunctionParser::parseOperation() {
  return ParseSuccess;
 }

-OperationInst *FunctionParser::parseVerboseOperation() {
+OperationInst *FunctionParser::parseGenericOperation() {

  // Get location information for the operation.
  auto srcLocation = getEncodedSourceLocation(getToken().getLoc());
@@ -3055,7 +3055,7 @@ ParseResult FunctionParser::parseBound(SmallVectorImpl<Value *> &operands,
    return ParseSuccess;
  }

-  // Parse shorthand form.
+  // Parse custom assembly form.
  if (getToken().isAny(Token::minus, Token::integer)) {
    int64_t val;
    if (!parseIntConstant(val)) {