Clean up SuperVectorization dialect printing and parsing.

On the read side,
```
%3 = vector_transfer_read %arg0, %i2, %i1, %i0 {permutation_map: (d0, d1, d2)->(d2, d0)} : (memref<?x?x?xf32>, index, index, index) -> vector<32x256xf32>
```

becomes:

```
%3 = vector_transfer_read %arg0[%i2, %i1, %i0] {permutation_map: (d0, d1, d2)->(d2, d0)} : memref<?x?x?xf32>, vector<32x256xf32>
```
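Reads that carry the optional padding value print it in parentheses after the bracketed indices, e.g. (a sketch modeled on the updated op documentation below; `%f0` stands for an assumed f32 padding operand):

```
%3 = vector_transfer_read %arg0[%i2, %i1, %i0], (%f0) {permutation_map: (d0, d1, d2)->(d2, d0)} : memref<?x?x?xf32>, vector<32x256xf32>
```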

On the write side,

```
vector_transfer_write %0, %arg0, %c3, %c3 {permutation_map: (d0, d1)->(d0)} : vector<128xf32>, memref<?x?xf32>, index, index
```

becomes:

```
vector_transfer_write %0, %arg0[%c3, %c3] {permutation_map: (d0, d1)->(d0)} : vector<128xf32>, memref<?x?xf32>
```

Documentation will be cleaned up in a follow-up commit that also extracts a proper .md from the comments at the top of the file.

PiperOrigin-RevId: 241021879
Author: Nicolas Vasilache, 2019-03-29 11:48:20 -07:00 (committed by jpienaar)
Commit: c9d5f3418a (parent: a38792f7d1)
32 changed files with 467 additions and 523 deletions


@ -11,7 +11,7 @@ identifiers to enable powerful analysis and transformation. A symbolic
identifier can be bound to an SSA value that is either an argument to the
function, a value defined at the top level of that function (outside of all
loops and if instructions), the result of a
[`constant` operation](LangRef.md#'constant'-operation), or the result of an
[`constant` operation](../LangRef.md#'constant'-operation), or the result of an
[`affine.apply` operation](#'affine.apply'-operation) that recursively takes as
arguments any symbolic identifiers. Dimensions may be bound not only to anything
that a symbol is bound to, but also to induction variables of enclosing
@ -30,7 +30,7 @@ operation ::= ssa-id `=` `affine.apply` affine-map dim-and-symbol-use-list
```
The `affine.apply` instruction applies an
[affine mapping](LangRef.md#affine-expressions) to a list of SSA values,
[affine mapping](../LangRef.md#affine-expressions) to a list of SSA values,
yielding a single SSA value. The number of dimension and symbol arguments to
affine.apply must be equal to the respective number of dimensional and symbolic
inputs to the affine mapping; the `affine.apply` instruction always returns one
@ -64,7 +64,7 @@ The `affine.for` operation represents an affine loop nest. It has one region
containing its body. This region must contain one block that terminates with
[`affine.terminator`](#'affine.terminator"-operation). *Note:* when `affine.for`
is printed in custom format, the terminator is omitted. The block has one
argument of [`index`](LangRef.md#index-type) type that represents the induction
argument of [`index`](../LangRef.md#index-type) type that represents the induction
variable of the loop.
The `affine.for` operation executes its body a number of times iterating from a
@ -122,7 +122,7 @@ space defined by an integer set (a conjunction of affine constraints). A single
`affine.if` may end with an optional `else` clause.
The condition of the `affine.if` is represented by an
[integer set](LangRef.md#integer-sets) (a conjunction of affine constraints),
[integer set](../LangRef.md#integer-sets) (a conjunction of affine constraints),
and the SSA values bound to the dimensions and symbols in the integer set. The
[same restrictions](#restrictions-on-dimensions-and-symbols) hold for these SSA
values as for all bindings of SSA values to dimensions and symbols.
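For illustration, a minimal sketch of an `affine.if` guarded by an integer set (the set, SSA names, and surrounding context are assumed; integer sets are normally defined at the top level of the module):

```mlir
// Execute the body only for iterations where 10 <= %i <= %N - 1.
#set0 = (d0)[s0] : (d0 - 10 >= 0, s0 - d0 - 1 >= 0)

affine.if #set0(%i)[%N] {
  // then-block, executed when the constraints hold
} else {
  // optional else-block
}
```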
@ -167,7 +167,7 @@ loops ([`for`](#'for'-operation)) and branches ([`if`](#'if'-operation)). It
unconditionally transmits the control flow to the successor of the operation
enclosing the region.
*Rationale*: bodies of affine operations are [blocks](LangRef.md#block) that
*Rationale*: bodies of affine operations are [blocks](../LangRef.md#block) that
must have terminators. Loops and branches represent structured control flow and
should not accept arbitrary branches as terminators.


@ -1,4 +1,4 @@
# SuperVector Dialect
# Vector Dialect
This dialect provides mid-level abstraction for the MLIR super-vectorizer.
@ -8,12 +8,12 @@ This dialect provides mid-level abstraction for the MLIR super-vectorizer.
### Vector transfers {#vector-transfers}
#### `vector_transfer_read` operation {#'vector_transfer_read'-operation}
#### `vector.transfer_read` operation {#'vector.transfer_read'-operation}
Syntax:
``` {.ebnf}
operation ::= ssa-id `=` `vector_transfer_read` ssa-use-list `{` attribute-entry `} :` function-type
operation ::= ssa-id `=` `vector.transfer_read` ssa-use-list `{` attribute-entry `} :` function-type
```
Examples:
@ -25,9 +25,9 @@ Examples:
for %i0 = 0 to %0 {
affine.for %i1 = 0 to %1 step 256 {
affine.for %i2 = 0 to %2 step 32 {
%v = vector_transfer_read %A, %i0, %i1, %i2, %f0
%v = vector.transfer_read %A[%i0, %i1, %i2], (%f0)
{permutation_map: (d0, d1, d2) -> (d2, d1)} :
(memref<?x?x?xf32>, index, index, f32) -> vector<32x256xf32>
memref<?x?x?xf32>, vector<32x256xf32>
}}}
// Read the slice `%A[%i0, %i1]` (i.e. the element `%A[%i0, %i1]`) into
@ -35,36 +35,36 @@ for %i0 = 0 to %0 {
// broadcast:
for %i0 = 0 to %0 {
affine.for %i1 = 0 to %1 {
%3 = vector_transfer_read %A, %i0, %i1
%3 = vector.transfer_read %A[%i0, %i1]
{permutation_map: (d0, d1) -> (0)} :
(memref<?x?xf32>, index, index) -> vector<128xf32>
memref<?x?xf32>, vector<128xf32>
}
}
```
The `vector_transfer_read` performs a blocking read from a slice within a scalar
[MemRef](#memref-type) supplied as its first operand into a
[vector](#vector-type) of the same elemental type. The slice is further defined
by a full-rank index within the MemRef, supplied as the operands `2 .. 1 +
rank(memref)`. The permutation_map [attribute](#attributes) is an
[affine-map](#affine-maps) which specifies the transposition on the slice to
match the vector shape. The size of the slice is specified by the size of the
vector, given as the return type. Optionally, an `ssa-value` of the same
elemental type as the MemRef is provided as the last operand to specify padding
in the case of out-of-bounds accesses. Absence of the optional padding value
signifies the `vector_transfer_read` is statically guaranteed to remain within
the MemRef bounds. This operation is called 'read' by opposition to 'load'
because the super-vector granularity is generally not representable with a
single hardware register. A `vector_transfer_read` is thus a mid-level
The `vector.transfer_read` performs a blocking read from a slice within a scalar
[MemRef](../LangRef.md#memref-type) supplied as its first operand into a
[vector](../LangRef.md#vector-type) of the same elemental type. The slice is
further defined by a full-rank index within the MemRef, supplied as the operands
`2 .. 1 + rank(memref)`. The permutation_map [attribute](../LangRef.md#attributes)
is an [affine-map](../LangRef.md#affine-maps) which specifies the transposition on
the slice to match the vector shape. The size of the slice is specified by the
size of the vector, given as the return type. Optionally, an `ssa-value` of the
same elemental type as the MemRef is provided as the last operand to specify
padding in the case of out-of-bounds accesses. Absence of the optional padding
value signifies the `vector.transfer_read` is statically guaranteed to remain
within the MemRef bounds. This operation is called 'read' by opposition to
'load' because the super-vector granularity is generally not representable with
a single hardware register. A `vector.transfer_read` is thus a mid-level
abstraction that supports super-vectorization with non-effecting padding for
full-tile-only code.
More precisely, let's dive deeper into the permutation_map for the following:
```mlir {.mlir}
vector_transfer_read %A, %expr1, %expr2, %expr3, %expr4
vector.transfer_read %A[%expr1, %expr2, %expr3, %expr4]
{ permutation_map : (d0,d1,d2,d3) -> (d2,0,d0) } :
(memref<?x?x?x?xf32>, index, index, index, index) -> vector<3x4x5xf32>
memref<?x?x?x?xf32>, vector<3x4x5xf32>
```
This operation always reads a slice starting at `%A[%expr1, %expr2, %expr3,
@ -74,7 +74,7 @@ This operation always reads a slice starting at `%A[%expr1, %expr2, %expr3,
That slice needs to be read into a `vector<3x4x5xf32>`. Since the permutation
map is not full rank, there must be a broadcast along vector dimension `1`.
A notional lowering of vector_transfer_read could generate code resembling:
A notional lowering of vector.transfer_read could generate code resembling:
```mlir {.mlir}
// %expr1, %expr2, %expr3, %expr4 defined before this point
@ -117,12 +117,12 @@ the same amount of data as the `3 * 5` values transferred. An additional `1`
broadcast is required. On a GPU this broadcast could be implemented using a
warp-shuffle if loop `j` were mapped to `threadIdx.x`.
#### `vector_transfer_write` operation {#'vector_transfer_write'-operation}
#### `vector.transfer_write` operation {#'vector.transfer_write'-operation}
Syntax:
``` {.ebnf}
operation ::= `vector_transfer_write` ssa-use-list `{` attribute-entry `} :` vector-type ', ' memref-type ', ' index-type-list
operation ::= `vector.transfer_write` ssa-use-list `{` attribute-entry `} :` vector-type ', ' memref-type ', ' index-type-list
```
Examples:
@ -134,45 +134,46 @@ for %i0 = 0 to %0 {
affine.for %i2 = 0 to %2 step 64 {
affine.for %i3 = 0 to %3 step 16 {
%val = `ssa-value` : vector<16x32x64xf32>
vector_transfer_write %val, %A, %i0, %i1, %i2, %i3
vector.transfer_write %val, %A[%i0, %i1, %i2, %i3]
{permutation_map: (d0, d1, d2, d3) -> (d3, d1, d2)} :
vector<16x32x64xf32>, memref<?x?x?x?xf32>, index, index, index, index
vector<16x32x64xf32>, memref<?x?x?x?xf32>
}}}}
```
The `vector_transfer_write` performs a blocking write from a
[vector](#vector-type), supplied as its first operand, into a slice within a
scalar [MemRef](#memref-type) of the same elemental type, supplied as its second
operand. The slice is further defined by a full-rank index within the MemRef,
supplied as the operands `3 .. 2 + rank(memref)`. The permutation_map
[attribute](#attributes) is an [affine-map](#affine-maps) which specifies the
transposition on the slice to match the vector shape. The size of the slice is
specified by the size of the vector. This operation is called 'write' by
opposition to 'store' because the super-vector granularity is generally not
representable with a single hardware register. A `vector_transfer_write` is thus
a mid-level abstraction that supports super-vectorization with non-effecting
padding for full-tile-only code. It is the responsibility of
`vector_transfer_write`'s implementation to ensure the memory writes are valid.
Different lowerings may be pertinent depending on the hardware support.
The `vector.transfer_write` performs a blocking write from a
[vector](../LangRef.md#vector-type), supplied as its first operand, into a slice
within a scalar [MemRef](../LangRef.md#memref-type) of the same elemental type,
supplied as its second operand. The slice is further defined by a full-rank
index within the MemRef, supplied as the operands `3 .. 2 + rank(memref)`. The
permutation_map [attribute](../LangRef.md#attributes) is an
[affine-map](../LangRef.md#affine-maps) which specifies the transposition on the
slice to match the vector shape. The size of the slice is specified by the size
of the vector. This operation is called 'write' by opposition to 'store' because
the super-vector granularity is generally not representable with a single
hardware register. A `vector.transfer_write` is thus a mid-level abstraction
that supports super-vectorization with non-effecting padding for full-tile-only
code. It is the responsibility of `vector.transfer_write`'s implementation to
ensure the memory writes are valid. Different lowerings may be pertinent
depending on the hardware support.
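As a complementary sketch (assuming the new custom syntax), a write whose permutation_map transposes the two memref dimensions before storing the vector:

```mlir
// Store a vector<32x16xf32> at %B[%i, %j]: vector dimension 0 (size 32)
// runs along memref dimension d1, and vector dimension 1 (size 16) along d0.
vector.transfer_write %val, %B[%i, %j]
  {permutation_map: (d0, d1) -> (d1, d0)} :
  vector<32x16xf32>, memref<?x?xf32>
```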
### Vector views {#vector-views}
#### `vector_type_cast` operation {#'vector_type_cast'-operation}
#### `vector.type_cast` operation {#'vector.type_cast'-operation}
Syntax:
``` {.ebnf}
operation ::= `vector_type_cast` ssa-use : memref-type, memref-type
operation ::= `vector.type_cast` ssa-use : memref-type, memref-type
```
Examples:
```mlir
%A = alloc() : memref<5x4x3xf32>
%VA = vector_type_cast %A : memref<5x4x3xf32>, memref<1xvector<5x4x3xf32>>
%VA = vector.type_cast %A : memref<5x4x3xf32>, memref<1xvector<5x4x3xf32>>
```
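One possible use of the cast result (a sketch based on the EDSC builder pseudo-code later in this commit) is to read the whole memref content back as a single vector value:

```mlir
%c0 = constant 0 : index
// The cast memref holds a single vector<5x4x3xf32> element, so one load at
// index 0 yields the full content as a vector value.
%v = load %VA[%c0] : memref<1xvector<5x4x3xf32>>
```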
The `vector_type_cast` operation performs a conversion from a memref with scalar
The `vector.type_cast` operation performs a conversion from a memref with scalar
element to memref with a *single* vector element, copying the shape of the
memref to the vector. This is the minimal viable operation that is required to
make super-vectorization operational. It can be seen as a special case of the


@ -2211,7 +2211,7 @@ a conversion pass.
Currently, MLIR supports the following dialects:
* [Standard dialect](#standard-operations)
* [SuperVector dialect](Dialects/SuperVector.md)
* [Vector dialect](Dialects/Vector.md)
* [TensorFlow dialect](#tensorflow-operations)
### TensorFlow operations {#tensorflow-operations}


@ -89,13 +89,13 @@ shapeRatio(VectorType superVectorType, VectorType subVectorType);
/// affine.for %i3 = 0 to %0 step 32 {
/// affine.for %i4 = 0 to %1 {
/// affine.for %i5 = 0 to %2 step 256 {
/// %4 = vector_transfer_read %arg0, %i4, %i5, %i3
/// %4 = vector.transfer_read %arg0, %i4, %i5, %i3
/// {permutation_map: (d0, d1, d2) -> (d2, d1)} :
/// (memref<?x?x?xf32>, index, index) -> vector<32x256xf32>
/// }}}
/// ```
///
/// Meaning that vector_transfer_read will be responsible for reading the slice:
/// Meaning that vector.transfer_read will be responsible for reading the slice:
/// `%arg0[%i4, %i5:%i5+256, %i3:%i3+32]` into vector<32x256xf32>.
///
/// Example 2:
@ -112,13 +112,13 @@ shapeRatio(VectorType superVectorType, VectorType subVectorType);
///
/// ```mlir
/// affine.for %i0 = 0 to %0 step 128 {
/// %3 = vector_transfer_read %arg0, %c0_0, %c0_0
/// %3 = vector.transfer_read %arg0, %c0_0, %c0_0
/// {permutation_map: (d0, d1) -> (0)} :
/// (memref<?x?xf32>, index, index) -> vector<128xf32>
/// }
/// ````
///
/// Meaning that vector_transfer_read will be responsible of reading the slice
/// Meaning that vector.transfer_read will be responsible for reading the slice
/// `%arg0[%c0, %c0]` into vector<128xf32> which needs a 1-D vector broadcast.
///
AffineMap makePermutationMap(
@ -127,7 +127,7 @@ AffineMap makePermutationMap(
namespace matcher {
/// Matches vector_transfer_read, vector_transfer_write and ops that return a
/// Matches vector.transfer_read, vector.transfer_write and ops that return a
/// vector type that is a multiple of the sub-vector type. This allows passing
/// over other smaller vector types in the function and avoids interfering with
/// operations on those.
@ -135,7 +135,7 @@ namespace matcher {
/// TODO(ntv): this could all be much simpler if we added a bit that a vector
/// type to mark that a vector is a strict super-vector but it still does not
/// warrant adding even 1 extra bit in the IR for now.
bool operatesOnSuperVectors(Operation &op, VectorType subVectorType);
bool operatesOnSuperVectorsOf(Operation &op, VectorType subVectorType);
} // end namespace matcher
} // end namespace mlir


@ -26,7 +26,7 @@
#include "mlir/AffineOps/AffineOps.h"
#include "mlir/IR/Builders.h"
#include "mlir/StandardOps/Ops.h"
#include "mlir/SuperVectorOps/SuperVectorOps.h"
#include "mlir/VectorOps/VectorOps.h"
namespace mlir {


@ -325,7 +325,7 @@ protected:
/// ```mlir
/// Stmt scalarValue, vectorValue, tmpAlloc, tmpDealloc, vectorView;
/// tmpAlloc = alloc(tmpMemRefType);
/// vectorView = vector_type_cast(tmpAlloc, vectorMemRefType),
/// vectorView = vector.type_cast(tmpAlloc, vectorMemRefType),
/// vectorValue = load(vectorView, zero),
/// tmpDealloc = dealloc(tmpAlloc)});
/// emitter.emitStmts({tmpAlloc, vectorView, vectorValue, tmpDealloc});
@ -391,7 +391,7 @@ protected:
/// Stmt scalarValue, vectorValue, tmpAlloc, tmpDealloc, vectorView;
/// Stmt block = Block({
/// tmpAlloc = alloc(tmpMemRefType),
/// vectorView = vector_type_cast(tmpAlloc, vectorMemRefType),
/// vectorView = vector.type_cast(tmpAlloc, vectorMemRefType),
/// For(ivs, lbs, ubs, steps, {
/// scalarValue = load(scalarMemRef,
/// accessInfo.clippedScalarAccessExprs), store(scalarValue, tmpAlloc,

View File

@ -1,4 +1,4 @@
//===- SuperVectorOps.h - MLIR Super Vectorizer Operations ------*- C++ -*-===//
//===- VectorOps.h - MLIR Super Vectorizer Operations -----------*- C++ -*-===//
//
// Copyright 2019 The MLIR Authors.
//
@ -20,8 +20,8 @@
//
//===----------------------------------------------------------------------===//
#ifndef MLIR_INCLUDE_MLIR_SUPERVECTOROPS_SUPERVECTOROPS_H
#define MLIR_INCLUDE_MLIR_SUPERVECTOROPS_SUPERVECTOROPS_H
#ifndef MLIR_VECTOROPS_VECTOROPS_H
#define MLIR_VECTOROPS_VECTOROPS_H
#include "mlir/IR/Attributes.h"
#include "mlir/IR/Dialect.h"
@ -31,9 +31,9 @@
namespace mlir {
/// Dialect for super-vectorization Ops.
class SuperVectorOpsDialect : public Dialect {
class VectorOpsDialect : public Dialect {
public:
SuperVectorOpsDialect(MLIRContext *context);
VectorOpsDialect(MLIRContext *context);
};
/// VectorTransferReadOp performs a blocking read from a scalar memref
@ -70,14 +70,12 @@ public:
/// ...
/// %val = `ssa-value` : f32
/// // let %i, %j, %k, %l be ssa-values of type index
/// %v0 = vector_transfer_read %src, %i, %j, %k, %l
/// %v0 = vector.transfer_read %src[%i, %j, %k, %l]
/// {permutation_map: (d0, d1, d2, d3) -> (d3, d1, d2)} :
/// (memref<?x?x?x?xf32>, index, index, index, index) ->
/// vector<16x32x64xf32>
/// %v1 = vector_transfer_read %src, %i, %j, %k, %l, %val
/// memref<?x?x?x?xf32>, vector<16x32x64xf32>
/// %v1 = vector.transfer_read %src[%i, %j, %k, %l], (%val)
/// {permutation_map: (d0, d1, d2, d3) -> (d3, d1, d2)} :
/// (memref<?x?x?x?xf32>, index, index, index, index, f32) ->
/// vector<16x32x64xf32>
/// memref<?x?x?x?xf32>, vector<16x32x64xf32>
/// ```
///
/// Example with partial rank permutation_map:
@ -86,10 +84,9 @@ public:
/// %A = alloc(%size1, %size2, %size3, %size4) : memref<?x?x?x?xf32>
/// ...
/// // let %i, %j be ssa-values of type index
/// %v0 = vector_transfer_read %src, %i, %c0, %c0, %c0
/// %v0 = vector.transfer_read %src[%i, %c0, %c0, %c0]
/// {permutation_map: (d0, d1, d2, d3) -> (0, d1, 0)} :
/// (memref<?x?x?x?xf32>, index, index, index, index) ->
/// vector<16x32x64xf32>
/// memref<?x?x?x?xf32>, vector<16x32x64xf32>
class VectorTransferReadOp
: public Op<VectorTransferReadOp, OpTrait::VariadicOperands,
OpTrait::OneResult> {
@ -98,7 +95,7 @@ class VectorTransferReadOp
public:
using Op::Op;
static StringRef getOperationName() { return "vector_transfer_read"; }
static StringRef getOperationName() { return "vector.transfer_read"; }
static StringRef getPermutationMapAttrName() { return "permutation_map"; }
static void build(Builder *builder, OperationState *result,
VectorType vectorType, Value *srcMemRef,
@ -133,7 +130,7 @@ public:
///
/// A vector transfer write has semantics similar to a vector store, with
/// additional support for handling out-of-bounds situations. It is the
/// responsibility of vector_transfer_write's implementation to ensure the
/// responsibility of vector.transfer_write's implementation to ensure the
/// memory writes are valid. Different implementations may be pertinent
/// depending on the hardware support including:
/// 1. predication;
@ -146,9 +143,9 @@ public:
/// %A = alloc(%size1, %size2, %size3, %size4) : memref<?x?x?x?xf32>.
/// %val = `ssa-value` : vector<16x32x64xf32>
/// // let %i, %j, %k, %l be ssa-values of type index
/// vector_transfer_write %val, %src, %i, %j, %k, %l
/// vector.transfer_write %val, %src[%i, %j, %k, %l]
/// {permutation_map: (d0, d1, d2, d3) -> (d3, d1, d2)} :
/// vector<16x32x64xf32>, memref<?x?x?x?xf32>, index, index, index, index
/// vector<16x32x64xf32>, memref<?x?x?x?xf32>
/// ```
class VectorTransferWriteOp
: public Op<VectorTransferWriteOp, OpTrait::VariadicOperands,
@ -162,7 +159,7 @@ class VectorTransferWriteOp
public:
using Op::Op;
static StringRef getOperationName() { return "vector_transfer_write"; }
static StringRef getOperationName() { return "vector.transfer_write"; }
static StringRef getPermutationMapAttrName() { return "permutation_map"; }
static void build(Builder *builder, OperationState *result, Value *srcVector,
Value *dstMemRef, ArrayRef<Value *> dstIndices,
@ -190,14 +187,14 @@ public:
///
/// ```mlir
/// %A = alloc() : memref<5x4x3xf32>
/// %VA = vector_type_cast %A : memref<5x4x3xf32>, memref<1xvector<5x4x3xf32>>
/// %VA = vector.type_cast %A : memref<5x4x3xf32>, memref<1xvector<5x4x3xf32>>
/// ```
class VectorTypeCastOp
: public Op<VectorTypeCastOp, OpTrait::OneOperand, OpTrait::OneResult> {
public:
using Op::Op;
static StringRef getOperationName() { return "vector_type_cast"; }
static StringRef getOperationName() { return "vector.type_cast"; }
static void build(Builder *builder, OperationState *result, Value *srcVector,
Type dstType);
static bool parse(OpAsmParser *parser, OperationState *result);
@ -207,4 +204,4 @@ public:
} // end namespace mlir
#endif // MLIR_INCLUDE_MLIR_SUPERVECTOROPS_SUPERVECTOROPS_H
#endif // MLIR_VECTOROPS_VECTOROPS_H


@ -30,9 +30,9 @@
#include "mlir/IR/Builders.h"
#include "mlir/IR/Operation.h"
#include "mlir/StandardOps/Ops.h"
#include "mlir/SuperVectorOps/SuperVectorOps.h"
#include "mlir/Support/Functional.h"
#include "mlir/Support/MathExtras.h"
#include "mlir/VectorOps/VectorOps.h"
#include "llvm/ADT/DenseSet.h"
#include "llvm/ADT/SmallString.h"


@ -22,9 +22,9 @@
#include "mlir/IR/IntegerSet.h"
#include "mlir/IR/Operation.h"
#include "mlir/StandardOps/Ops.h"
#include "mlir/SuperVectorOps/SuperVectorOps.h"
#include "mlir/Support/Functional.h"
#include "mlir/Support/STLExtras.h"
#include "mlir/VectorOps/VectorOps.h"
#include "llvm/ADT/DenseSet.h"
#include "llvm/ADT/SetVector.h"
@ -185,11 +185,11 @@ AffineMap mlir::makePermutationMap(
return ::makePermutationMap(store.getIndices(), enclosingLoopToVectorDim);
}
bool mlir::matcher::operatesOnSuperVectors(Operation &op,
VectorType subVectorType) {
bool mlir::matcher::operatesOnSuperVectorsOf(Operation &op,
VectorType subVectorType) {
// First, extract the vector type and distinguish between:
// a. ops that *must* lower a super-vector (i.e. vector_transfer_read,
// vector_transfer_write); and
// a. ops that *must* lower a super-vector (i.e. vector.transfer_read,
// vector.transfer_write); and
// b. ops that *may* lower a super-vector (all other ops).
// The ops that *may* lower a super-vector only do so if the super-vector to
// sub-vector ratio exists. The ops that *must* lower a super-vector are
@ -218,7 +218,7 @@ bool mlir::matcher::operatesOnSuperVectors(Operation &op,
return false;
}
} else {
// Not a vector_transfer and has more than 1 result, fail hard for now to
// Not a vector.transfer and has more than 1 result, fail hard for now to
// wake us up when something changes.
op.emitError("NYI: operation has more than 1 result");
return false;
@ -229,7 +229,7 @@ bool mlir::matcher::operatesOnSuperVectors(Operation &op,
// Sanity check.
assert((ratio.hasValue() || !mustDivide) &&
"vector_transfer operation in which super-vector size is not an"
"vector.transfer operation in which super-vector size is not an"
" integer multiple of sub-vector size");
// This catches cases that are not strictly necessary to have multiplicity but


@ -18,7 +18,7 @@
#include "mlir/EDSC/Intrinsics.h"
#include "mlir/EDSC/Builders.h"
#include "mlir/IR/AffineExpr.h"
#include "mlir/SuperVectorOps/SuperVectorOps.h"
#include "mlir/VectorOps/VectorOps.h"
using namespace mlir;
using namespace mlir::edsc;


@ -32,9 +32,9 @@
#include "mlir/IR/Operation.h"
#include "mlir/IR/Value.h"
#include "mlir/StandardOps/Ops.h"
#include "mlir/SuperVectorOps/SuperVectorOps.h"
#include "mlir/Support/Functional.h"
#include "mlir/Support/STLExtras.h"
#include "mlir/VectorOps/VectorOps.h"
using llvm::dbgs;
using llvm::errs;


@ -27,8 +27,8 @@
#include "mlir/IR/OperationSupport.h"
#include "mlir/IR/StandardTypes.h"
#include "mlir/StandardOps/Ops.h"
#include "mlir/SuperVectorOps/SuperVectorOps.h"
#include "mlir/Support/STLExtras.h"
#include "mlir/VectorOps/VectorOps.h"
#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/StringSwitch.h"


@ -38,10 +38,10 @@
#include "mlir/IR/Types.h"
#include "mlir/Pass/Pass.h"
#include "mlir/StandardOps/Ops.h"
#include "mlir/SuperVectorOps/SuperVectorOps.h"
#include "mlir/Support/Functional.h"
#include "mlir/Transforms/MLPatternLoweringPass.h"
#include "mlir/Transforms/Passes.h"
#include "mlir/VectorOps/VectorOps.h"
/// Implements lowering of VectorTransferReadOp and VectorTransferWriteOp to a
/// proper abstraction for the hardware.
@ -58,9 +58,9 @@
/// affine.for %i0 = 0 to %0 {
/// affine.for %i1 = 0 to %1 step 256 {
/// affine.for %i2 = 0 to %2 step 32 {
/// %v = vector_transfer_read %A, %i0, %i1, %i2, %f0
/// %v = vector.transfer_read %A[%i0, %i1, %i2], (%f0)
/// {permutation_map: (d0, d1, d2) -> (d2, d1)} :
/// (memref<?x?x?xf32>, index, index, f32) -> vector<32x256xf32>
/// memref<?x?x?xf32>, vector<32x256xf32>
/// }}}
/// ```
///


@ -37,10 +37,10 @@
#include "mlir/IR/Types.h"
#include "mlir/Pass/Pass.h"
#include "mlir/StandardOps/Ops.h"
#include "mlir/SuperVectorOps/SuperVectorOps.h"
#include "mlir/Support/Functional.h"
#include "mlir/Support/LLVM.h"
#include "mlir/Transforms/Passes.h"
#include "mlir/VectorOps/VectorOps.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Support/Debug.h"
@ -54,7 +54,7 @@
/// a target-independent way: the target vector size is specified as a parameter
/// to the pass. This pass is thus a partial lowering that opens the "greybox"
/// that is the super-vector abstraction. In particular, this pass can turn the
/// vector_transfer_read and vector_transfer_write ops in either:
/// vector.transfer_read and vector.transfer_write ops in either:
/// 1. a loop nest with either scalar and vector load/store operations; or
/// 2. a loop-nest with DmaStartOp / DmaWaitOp; or
/// 3. a pre-existing blackbox library call that can be written manually or
@ -79,14 +79,14 @@
/// do not vectorize in the presence of conditionals for now, sliced chains are
/// guaranteed not to escape the innermost scope, which has to be either the top
/// Function scope or the innermost loop scope, by construction. As a
/// consequence, the implementation just starts from vector_transfer_write
/// consequence, the implementation just starts from vector.transfer_write
/// operations and builds the slice scoped the innermost loop enclosing the
/// current vector_transfer_write. These assumptions and the implementation
/// current vector.transfer_write. These assumptions and the implementation
/// details are subject to revision in the future.
///
/// Example
/// ========
/// In the following, the single vector_transfer_write op operates on a
/// In the following, the single vector.transfer_write op operates on a
/// vector<4x4x4xf32>. Let's assume the HW supports vector<4x4xf32>.
/// Materialization is achieved by instantiating each occurrence of the leading
/// dimension of vector<4x4x4xf32> into a vector<4x4xf32>.
@ -98,16 +98,15 @@
///
/// ```mlir
/// mlfunc @materialize(%M : index, %N : index, %O : index, %P : index) {
/// %A = alloc (%M, %N, %O, %P) : memref<?x?x?x?xf32, 0>
/// %A = alloc (%M, %N, %O, %P) : memref<?x?x?x?xf32>
/// %f1 = constant splat<vector<4x4x4xf32>, 1.000000e+00> :
/// vector<4x4x4xf32> affine.for %i0 = 0 to %M step 4 {
/// affine.for %i1 = 0 to %N step 4 {
/// affine.for %i2 = 0 to %O {
/// affine.for %i3 = 0 to %P step 4 {
/// vector_transfer_write %f1, %A, %i0, %i1, %i2, %i3
/// vector.transfer_write %f1, %A[%i0, %i1, %i2, %i3]
/// {permutation_map: (d0, d1, d2, d3) -> (d3, d1, d0)} :
/// vector<4x4x4xf32>, memref<?x?x?x?xf32, 0>,
/// index, index, index, index
/// vector<4x4x4xf32>, memref<?x?x?x?xf32>
/// }}}}
/// return
/// }
@ -123,30 +122,21 @@
/// affine.for %i1 = 0 to %arg1 step 4 {
/// affine.for %i2 = 0 to %arg2 {
/// affine.for %i3 = 0 to %arg3 step 4 {
/// %1 = affine.apply (d0, d1, d2, d3) -> (d0, d1, d2, d3)
/// (%i0, %i1, %i2, %i3)
/// vector_transfer_write f1, %0, %1#0, %1#1, %1#2, %1#3
/// vector.transfer_write f1, %0[%i0, %i1, %i2, %i3]
/// {permutation_map: (d0, d1, d2, d3) -> (d1, d0)} :
/// vector<4x4xf32>, memref<?x?x?x?xf32>,
/// index, index, index, index
/// %2 = affine.apply (d0, d1, d2, d3) -> (d0, d1, d2, d3 + 1)
/// (%i0, %i1, %i2, %i3)
/// vector_transfer_write {{.*}}, %0, %2#0, %2#1, %2#2, %2#3
/// vector<4x4xf32>, memref<?x?x?x?xf32>
/// %i3p1 = affine.apply (d0) -> (d0 + 1)(%i3)
/// vector.transfer_write {{.*}}, %0[%i0, %i1, %i2, %i3p1]
/// {permutation_map: (d0, d1, d2, d3) -> (d1, d0)} :
/// vector<4x4xf32>, memref<?x?x?x?xf32>,
/// index, index, index, index
/// %3 = affine.apply (d0, d1, d2, d3) -> (d0, d1, d2, d3 + 2)
/// (%i0, %i1, %i2, %i3)
/// vector_transfer_write {{.*}}, %0, %3#0, %3#1, %3#2, %3#3
/// vector<4x4xf32>, memref<?x?x?x?xf32>
/// %i3p2 = affine.apply (d0) -> (d0 + 2)(%i3)
/// vector.transfer_write {{.*}}, %0[%i0, %i1, %i2, %i3p2]
/// {permutation_map: (d0, d1, d2, d3) -> (d1, d0)} :
/// vector<4x4xf32>, memref<?x?x?x?xf32>,
/// index, index, index, index
/// %4 = affine.apply (d0, d1, d2, d3) -> (d0, d1, d2, d3 + 3)
/// (%i0, %i1, %i2, %i3)
/// vector_transfer_write {{.*}}, %0, %4#0, %4#1, %4#2, %4#3
/// vector<4x4xf32>, memref<?x?x?x?xf32>
/// %i3p3 = affine.apply (d0) -> (d0 + 3)(%i3)
/// vector.transfer_write {{.*}}, %0[%i0, %i1, %i2, %i3p3]
/// {permutation_map: (d0, d1, d2, d3) -> (d1, d0)} :
/// vector<4x4xf32>, memref<?x?x?x?xf32>,
/// index, index, index, index
/// vector<4x4xf32>, memref<?x?x?x?xf32>
/// }}}}
/// return
/// }
@ -282,11 +272,11 @@ static Value *substitute(Value *v, VectorType hwVectorType,
/// Returns a list of single result AffineApplyOps that reindex the
/// `memRefIndices` by the multi-dimensional `hwVectorInstance`. This is used by
/// the function that materializes a vector_transfer operation to use hardware
/// the function that materializes a vector.transfer operation to use hardware
/// vector types instead of super-vector types.
///
/// The general problem this function solves is as follows:
/// Assume a vector_transfer operation at the super-vector granularity that has
/// Assume a vector.transfer operation at the super-vector granularity that has
/// `l` enclosing loops (AffineForOp). Assume the vector transfer operation
/// operates on a MemRef of rank `r`, a super-vector of rank `s` and a hardware
/// vector of rank `h`. For the purpose of illustration assume l==4, r==3, s==2,
@ -299,8 +289,8 @@ static Value *substitute(Value *v, VectorType hwVectorType,
/// affine.for %i1 = 0 to %N step 3 {
/// affine.for %i2 = 0 to %O {
/// affine.for %i3 = 0 to %P step 32 {
/// %r = vector_transfer_read(%A, map(%i..)#0, map(%i..)#1, map(%i..)#2)
/// -> vector<3x32xf32>
/// %r = vector.transfer_read(%A, map0(%i..), map1(%i..), map2(%i..)) :
/// vector<3x32xf32>, memref<?x?x?xf32>
/// ...
/// }}}}
/// ```
@ -308,28 +298,31 @@ static Value *substitute(Value *v, VectorType hwVectorType,
/// where map denotes an AffineMap operating on enclosing loops with properties
/// compatible for vectorization (i.e. some contiguity left unspecified here).
/// Note that the vectorized loops are %i1 and %i3.
/// This function translates the vector_transfer_read operation to multiple
/// instances of vector_transfer_read that operate on vector<8x32>.
/// This function translates the vector.transfer_read operation to multiple
/// instances of vector.transfer_read that operate on vector<8x32>.
///
/// Without loss of generality, we assume hwVectorInstance is: {2, 1}.
/// The only constraint on hwVectorInstance is that it belongs to:
/// [0, 2] x [0, 3], which is the span of the ratio of super-vector shape to
/// hardware vector shape in our example.
///
/// This function instantiates the iteration <2, 1> of vector_transfer_read
/// This function instantiates the iteration <2, 1> of vector.transfer_read
/// into the set of operations in pseudo-MLIR:
///
/// ```mlir
/// map2 = (d0, d1, d2, d3) -> (d0, d1 + 2, d2, d3 + 1 * 8)
/// map3 = map o map2 // where o denotes composition
/// %r = vector_transfer_read(%A, map3(%i..)#0, map3(%i..)#1, map3(%i..)#2)
/// -> vector<3x32xf32>
/// #map2 = (d0, d1, d2, d3) -> (d0, d1 + 2, d2, d3 + 1 * 8)
/// #map3 = #map o #map2 // where o denotes composition
/// aff0 = affine.apply #map3.0(%i..)
/// aff1 = affine.apply #map3.1(%i..)
/// aff2 = affine.apply #map3.2(%i..)
/// %r = vector.transfer_read(%A, %aff0, %aff1, %aff2):
/// vector<3x32xf32>, memref<?x?x?xf32>
/// ```
///
/// Practical considerations
/// ========================
/// For now, `map` is assumed to be the identity map and the indices are
/// specified just as vector_transfer_read(%A, %i0, %i1, %i2, %i3). This will be
/// specified just as vector.transfer_read %A[%i0, %i1, %i2, %i3]. This will be
/// extended in the future once we have a proper Op for vector transfers.
/// Additionally, the example above is specified in pseudo-MLIR form; once we
/// have proper support for generic maps we can generate the code and show
@ -347,7 +340,7 @@ reindexAffineIndices(FuncBuilder *b, VectorType hwVectorType,
unsigned numIndices = memrefIndices.size();
auto numMemRefIndices = numIndices - hwVectorInstance.size();
auto numSuperVectorIndices = hwVectorInstance.size() - vectorShape.size();
auto numVectorIndices = hwVectorInstance.size() - vectorShape.size();
SmallVector<AffineExpr, 8> affineExprs;
// TODO(ntv): support a concrete map and composition.
@ -358,11 +351,10 @@ reindexAffineIndices(FuncBuilder *b, VectorType hwVectorType,
auto d_i = b->getAffineDimExpr(i);
affineExprs.push_back(d_i);
}
// The next numSuperVectorIndices correspond to super-vector dimensions that
// The next numVectorIndices correspond to super-vector dimensions that
// do not have a hardware vector dimension counterpart. For those we only
// need to increment the index by the corresponding hwVectorInstance.
for (i = numMemRefIndices; i < numMemRefIndices + numSuperVectorIndices;
++i) {
for (i = numMemRefIndices; i < numMemRefIndices + numVectorIndices; ++i) {
auto d_i = b->getAffineDimExpr(i);
auto offset = hwVectorInstance[i - numMemRefIndices];
affineExprs.push_back(d_i + offset);
@ -374,7 +366,7 @@ reindexAffineIndices(FuncBuilder *b, VectorType hwVectorType,
for (; i < numIndices; ++i) {
auto d_i = b->getAffineDimExpr(i);
auto offset = hwVectorInstance[i - numMemRefIndices];
auto stride = vectorShape[i - numMemRefIndices - numSuperVectorIndices];
auto stride = vectorShape[i - numMemRefIndices - numVectorIndices];
affineExprs.push_back(d_i + offset * stride);
}
@ -535,14 +527,14 @@ static Operation *instantiate(FuncBuilder *b, VectorTransferWriteOp write,
/// The multi-dimensional `hwVectorInstance` belongs to the shapeRatio of
/// super-vector type to hw vector type.
/// A cloned instance of `op` is formed as follows:
/// 1. vector_transfer_read: the return `superVectorType` is replaced by
/// 1. vector.transfer_read: the return `superVectorType` is replaced by
/// `hwVectorType`. Additionally, affine indices are reindexed with
/// `reindexAffineIndices` using `hwVectorInstance` and vector type
/// information;
/// 2. vector_transfer_write: the `valueToStore` type is simply substituted.
/// 2. vector.transfer_write: the `valueToStore` type is simply substituted.
/// Since we operate on a topologically sorted slice, a substitution must
/// have been registered for non-constant ops. Additionally, affine indices
/// are reindexed in the same way as for vector_transfer_read;
/// are reindexed in the same way as for vector.transfer_read;
/// 3. constant ops are splats of the super-vector type by construction.
/// They are cloned to a splat on the hw vector type with the same value;
/// 4. remaining ops are cloned to version of the op that returns a hw vector
@ -660,7 +652,7 @@ static bool emitSlice(MaterializationState *state,
}
/// Materializes super-vector types into concrete hw vector types as follows:
/// 1. start from super-vector terminators (current vector_transfer_write
/// 1. start from super-vector terminators (current vector.transfer_write
/// ops);
/// 2. collect all the operations that can be reached by transitive use-defs
/// chains;
@ -755,13 +747,13 @@ void MaterializeVectorsPass::runOnFunction() {
auto subVectorType =
VectorType::get(hwVectorSize, FloatType::getF32(&getContext()));
// Capture terminators; i.e. vector_transfer_write ops involving a strict
// Capture terminators; i.e. vector.transfer_write ops involving a strict
// super-vector of subVectorType.
auto filter = [subVectorType](Operation &op) {
if (!op.isa<VectorTransferWriteOp>()) {
return false;
}
return matcher::operatesOnSuperVectors(op, subVectorType);
return matcher::operatesOnSuperVectorsOf(op, subVectorType);
};
auto pat = Op(filter);
SmallVector<NestedMatch, 8> matches;


@ -108,7 +108,7 @@ void VectorizerTestPass::testVectorShapeRatio(llvm::raw_ostream &outs) {
assert(subVectorType.getElementType() ==
FloatType::getF32(subVectorType.getContext()) &&
"Only f32 supported for now");
if (!matcher::operatesOnSuperVectors(op, subVectorType)) {
if (!matcher::operatesOnSuperVectorsOf(op, subVectorType)) {
return false;
}
if (op.getNumResults() != 1) {


@ -32,10 +32,10 @@
#include "mlir/IR/Types.h"
#include "mlir/Pass/Pass.h"
#include "mlir/StandardOps/Ops.h"
#include "mlir/SuperVectorOps/SuperVectorOps.h"
#include "mlir/Support/Functional.h"
#include "mlir/Support/LLVM.h"
#include "mlir/Transforms/Passes.h"
#include "mlir/VectorOps/VectorOps.h"
#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/DenseSet.h"
@ -69,7 +69,7 @@ using namespace mlir;
/// Some may prefer the terminology a "tile of HW vectors". In this case, one
/// should note that super-vectors implement an "always full tile" abstraction.
/// They guarantee no partial-tile separation is necessary by relying on a
/// high-level copy-reshape abstraction that we call vector_transfer. This
/// high-level copy-reshape abstraction that we call vector.transfer. This
/// copy-reshape operation is also responsible for performing layout
/// transposition if necessary. In the general case this will require a scoped
/// allocation in some notional local memory.
@ -115,19 +115,17 @@ using namespace mlir;
/// At a high level, a vectorized load in a loop will resemble:
/// ```mlir
/// affine.for %i = ? to ? step ? {
/// %v_a = "vector_transfer_read" (A, %i) : (memref<?xf32>, index) ->
/// vector<128xf32>
/// %v_a = vector.transfer_read A[%i] : memref<?xf32>, vector<128xf32>
/// }
/// ```
/// It is the reponsibility of the implementation of the vector_transfer_read
/// to materialize vector registers from the original scalar memrefs.
/// A later (more target-dependent) lowering pass will materialize to actual HW
/// vector sizes. This lowering may be occur at different times:
/// It is the responsibility of the implementation of vector.transfer_read to
/// materialize vector registers from the original scalar memrefs. A later (more
/// target-dependent) lowering pass will materialize to actual HW vector sizes.
/// This lowering may occur at different times:
/// 1. at the MLIR level into a combination of loops, unrolling, DmaStartOp +
/// DmaWaitOp + vectorized operations
/// for data transformations and shuffle; thus opening opportunities for
/// unrolling and pipelining. This is an instance of library call
/// "whiteboxing"; or
/// DmaWaitOp + vectorized operations for data transformations and shuffle;
/// thus opening opportunities for unrolling and pipelining. This is an
/// instance of library call "whiteboxing"; or
/// 2. later in a target-specific lowering pass or hand-written library
/// call; achieving full separation of concerns. This is an instance of
/// library call; or
@ -225,7 +223,7 @@ using namespace mlir;
/// 3. right after polyhedral-style scheduling: PLUTO-style algorithms are known
/// to improve locality, parallelism and be configurable (e.g. max-fuse,
/// smart-fuse etc). They can also have adverse effects on contiguity
/// properties that are required for vectorization but the vector_transfer
/// properties that are required for vectorization but the vector.transfer
/// copy-reshape-pad-transpose abstraction is expected to help recapture
/// these properties.
/// 4. right after polyhedral-style scheduling+tiling;
@ -265,7 +263,7 @@ using namespace mlir;
/// 3. Then, for each pattern in order:
/// a. applying iterative rewriting of the loop and the load operations in
/// DFS postorder. Rewriting is implemented by coarsening the loops and
/// turning load operations into opaque vector_transfer_read ops;
/// turning load operations into opaque vector.transfer_read ops;
/// b. keeping track of the load operations encountered as "roots" and the
/// store operations as "terminals";
/// c. traversing the use-def chains starting from the roots and iteratively
@ -304,7 +302,7 @@ using namespace mlir;
/// %s5 = addf %a5, %b5 : f32`
///
/// Lastly, we show a minimal example for which use-def chains rooted in load /
/// vector_transfer_read are not enough. This is what motivated splitting
/// vector.transfer_read are not enough. This is what motivated splitting
/// terminal processing out of the use-def chains starting from loads. In the
/// following snippet, there is simply no load:
/// ```mlir
@ -343,8 +341,7 @@ using namespace mlir;
/// scheduling, so we want to generate a pattern that resembles:
/// ```mlir
/// affine.for %i = ? to ? step ? {
/// %v_a = "vector_transfer_read" (A, %i) : (memref<?xf32>, index) ->
/// vector<128xf32>
/// %v_a = vector.transfer_read A[%i] : memref<?xf32>, vector<128xf32>
/// }
/// ```
///
@ -364,8 +361,7 @@ using namespace mlir;
/// abstraction of size 128 returns code similar to:
/// ```mlir
/// affine.for %i = %M to %N step 128 {
/// %v_a = "vector_transfer_read" (A, %i) : (memref<?xf32>, index) ->
/// vector<128xf32>
/// %v_a = vector.transfer_read A[%i] : memref<?xf32>, vector<128xf32>
/// }
/// ```
///
@ -373,9 +369,9 @@ using namespace mlir;
/// ========================================================================
/// 1. lowering to concrete vector types for various HW;
/// 2. reduction support;
/// 3. non-effecting padding during vector_transfer_read and filter during
/// vector_transfer_write;
/// 4. misalignment support vector_transfer_read / vector_transfer_write
/// 3. non-effecting padding during vector.transfer_read and filter during
/// vector.transfer_write;
/// 4. misalignment support vector.transfer_read / vector.transfer_write
/// (hopefully without read-modify-writes);
/// 5. control-flow support;
/// 6. cost-models, heuristics and search;
@ -443,24 +439,24 @@ using namespace mlir;
/// affine.for %i1 = 0 to %arg1 step 256 {
/// %cst_1 = constant splat<vector<256xf32>, 1.0> :
/// vector<256xf32>
/// "vector_transfer_write"(%cst_1, %0, %i0, %i1) :
/// (vector<256xf32>, memref<?x?xf32>, index, index) -> ()
/// vector.transfer_write %cst_1, %0[%i0, %i1] :
/// vector<256xf32>, memref<?x?xf32>
/// }
/// }
/// affine.for %i2 = 0 to %arg0 {
/// affine.for %i3 = 0 to %arg1 step 256 {
/// %cst_2 = constant splat<vector<256xf32>, 2.0> :
/// vector<256xf32>
/// "vector_transfer_write"(%cst_2, %1, %i2, %i3) :
/// (vector<256xf32>, memref<?x?xf32>, index, index) -> ()
/// vector.transfer_write %cst_2, %1[%i2, %i3] :
/// vector<256xf32>, memref<?x?xf32>
/// }
/// }
/// affine.for %i4 = 0 to %arg0 {
/// affine.for %i5 = 0 to %arg1 step 256 {
/// %3 = "vector_transfer_read"(%0, %i4, %i5) :
/// (memref<?x?xf32>, index, index) -> vector<256xf32>
/// %4 = "vector_transfer_read"(%1, %i4, %i5) :
/// (memref<?x?xf32>, index, index) -> vector<256xf32>
/// %3 = vector.transfer_read %0[%i4, %i5] :
/// memref<?x?xf32>, vector<256xf32>
/// %4 = vector.transfer_read %1[%i4, %i5] :
/// memref<?x?xf32>, vector<256xf32>
/// %5 = addf %3, %4 : vector<256xf32>
/// %cst_3 = constant splat<vector<256xf32>, 1.0> :
/// vector<256xf32>
@ -469,8 +465,8 @@ using namespace mlir;
/// vector<256xf32>
/// %7 = addf %5, %cst_4 : vector<256xf32>
/// %8 = addf %7, %6 : vector<256xf32>
/// "vector_transfer_write"(%8, %2, %i4, %i5) :
/// (vector<256xf32>, memref<?x?xf32>, index, index) -> ()
/// vector.transfer_write %8, %2[%i4, %i5] :
/// vector<256xf32>, memref<?x?xf32>
/// }
/// }
/// %c7 = constant 7 : index
@ -499,24 +495,24 @@ using namespace mlir;
/// affine.for %i1 = 0 to %arg1 step 256 {
/// %cst_1 = constant splat<vector<32x256xf32>, 1.0> :
/// vector<32x256xf32>
/// "vector_transfer_write"(%cst_1, %0, %i0, %i1) :
/// (vector<32x256xf32>, memref<?x?xf32>, index, index) -> ()
/// vector.transfer_write %cst_1, %0[%i0, %i1] :
/// vector<32x256xf32>, memref<?x?xf32>
/// }
/// }
/// affine.for %i2 = 0 to %arg0 step 32 {
/// affine.for %i3 = 0 to %arg1 step 256 {
/// %cst_2 = constant splat<vector<32x256xf32>, 2.0> :
/// vector<32x256xf32>
/// "vector_transfer_write"(%cst_2, %1, %i2, %i3) :
/// (vector<32x256xf32>, memref<?x?xf32>, index, index) -> ()
/// vector.transfer_write %cst_2, %1[%i2, %i3] :
/// vector<32x256xf32>, memref<?x?xf32>
/// }
/// }
/// affine.for %i4 = 0 to %arg0 step 32 {
/// affine.for %i5 = 0 to %arg1 step 256 {
/// %3 = "vector_transfer_read"(%0, %i4, %i5) :
/// (memref<?x?xf32>, index, index) -> vector<32x256xf32>
/// %4 = "vector_transfer_read"(%1, %i4, %i5) :
/// (memref<?x?xf32>, index, index) -> vector<32x256xf32>
/// %3 = vector.transfer_read %0[%i4, %i5] :
/// memref<?x?xf32>, vector<32x256xf32>
/// %4 = vector.transfer_read %1[%i4, %i5] :
/// memref<?x?xf32>, vector<32x256xf32>
/// %5 = addf %3, %4 : vector<32x256xf32>
/// %cst_3 = constant splat<vector<32x256xf32>, 1.0> :
/// vector<32x256xf32>
@ -525,8 +521,8 @@ using namespace mlir;
/// vector<32x256xf32>
/// %7 = addf %5, %cst_4 : vector<32x256xf32>
/// %8 = addf %7, %6 : vector<32x256xf32>
/// "vector_transfer_write"(%8, %2, %i4, %i5) :
/// (vector<32x256xf32>, memref<?x?xf32>, index, index) -> ()
/// vector.transfer_write %8, %2[%i4, %i5] :
/// vector<32x256xf32>, memref<?x?xf32>
/// }
/// }
/// %c7 = constant 7 : index
@ -788,7 +784,7 @@ void VectorizationState::registerReplacement(Value *key, Value *value) {
/// Handles the vectorization of load and store MLIR operations.
///
/// LoadOp operations are the roots of the vectorizeNonTerminals call. They are
/// vectorized immediately. The resulting vector_transfer_read is immediately
/// vectorized immediately. The resulting vector.transfer_read is immediately
/// registered to replace all uses of the LoadOp in this pattern's scope.
///
/// StoreOp are the terminals of the vectorizeNonTerminals call. They need to be
@ -811,9 +807,9 @@ static LogicalResult vectorizeRootOrTerminal(Value *iv,
// Materialize a MemRef with 1 vector.
auto *opInst = memoryOp.getOperation();
// For now, vector_transfers must be aligned, operate only on indices with an
// For now, vector.transfers must be aligned, operate only on indices with an
// identity subset of AffineMap and do not change layout.
// TODO(ntv): increase the expressiveness power of vector_transfer operations
// TODO(ntv): increase the expressiveness power of vector.transfer operations
// as needed by various targets.
if (opInst->template isa<LoadOp>()) {
auto permutationMap =
@ -835,7 +831,7 @@ static LogicalResult vectorizeRootOrTerminal(Value *iv,
/// end TODO(ntv): Hoist to a VectorizationMaterialize.cpp when appropriate. ///
/// Coarsens the loops bounds and transforms all remaining load and store
/// operations into the appropriate vector_transfer.
/// operations into the appropriate vector.transfer.
static LogicalResult vectorizeAffineForOp(AffineForOp loop, int64_t step,
VectorizationState *state) {
using namespace functional;
@ -1023,9 +1019,9 @@ static Operation *vectorizeOneOperation(Operation *opInst,
assert(!opInst->isa<LoadOp>() &&
"all loads must have already been fully vectorized independently");
assert(!opInst->isa<VectorTransferReadOp>() &&
"vector_transfer_read cannot be further vectorized");
"vector.transfer_read cannot be further vectorized");
assert(!opInst->isa<VectorTransferWriteOp>() &&
"vector_transfer_write cannot be further vectorized");
"vector.transfer_write cannot be further vectorized");
if (auto store = opInst->dyn_cast<StoreOp>()) {
auto *memRef = store.getMemRef();


@ -15,8 +15,8 @@
// limitations under the License.
// =============================================================================
#include "mlir/SuperVectorOps/SuperVectorOps.h"
#include "mlir/VectorOps/VectorOps.h"
using namespace mlir;
// Static initialization for SuperVectorOps dialect registration.
static DialectRegistration<SuperVectorOpsDialect> SuperVectorOps;
// Static initialization for VectorOps dialect registration.
static DialectRegistration<VectorOpsDialect> VectorOps;


@ -1,4 +1,4 @@
//===- SuperVectorOps.cpp - MLIR Super Vectorizer Operations---------------===//
//===- VectorOps.cpp - MLIR Super Vectorizer Operations -------------------===//
//
// Copyright 2019 The MLIR Authors.
//
@ -20,7 +20,7 @@
//
//===----------------------------------------------------------------------===//
#include "mlir/SuperVectorOps/SuperVectorOps.h"
#include "mlir/VectorOps/VectorOps.h"
#include "mlir/IR/AffineExpr.h"
#include "mlir/IR/AffineMap.h"
#include "mlir/IR/Builders.h"
@ -29,11 +29,11 @@
using namespace mlir;
//===----------------------------------------------------------------------===//
// SuperVectorOpsDialect
// VectorOpsDialect
//===----------------------------------------------------------------------===//
SuperVectorOpsDialect::SuperVectorOpsDialect(MLIRContext *context)
: Dialect(/*namePrefix=*/"", context) {
VectorOpsDialect::VectorOpsDialect(MLIRContext *context)
: Dialect("vector", context) {
addOperations<VectorTransferReadOp, VectorTransferWriteOp,
VectorTypeCastOp>();
}
@ -107,88 +107,65 @@ AffineMap VectorTransferReadOp::getPermutationMap() {
void VectorTransferReadOp::print(OpAsmPrinter *p) {
*p << getOperationName() << " ";
p->printOperand(getMemRef());
*p << ", ";
*p << "[";
p->printOperands(getIndices());
*p << "]";
auto optionalPaddingValue = getPaddingValue();
if (optionalPaddingValue) {
*p << ", ";
*p << ", (";
p->printOperand(*optionalPaddingValue);
*p << ")";
}
p->printOptionalAttrDict(getAttrs());
// Construct the FunctionType and print it.
llvm::SmallVector<Type, 8> inputs{getMemRefType()};
// Must have at least one actual index, see verify.
Value *firstIndex = *getIndices().begin();
Type indexType = firstIndex->getType();
inputs.append(getMemRefType().getRank(), indexType);
if (optionalPaddingValue) {
inputs.push_back((*optionalPaddingValue)->getType());
}
*p << " : "
<< FunctionType::get(inputs, {getResultType()}, indexType.getContext());
*p << " : " << getMemRefType();
*p << ", " << getResultType();
}
bool VectorTransferReadOp::parse(OpAsmParser *parser, OperationState *result) {
SmallVector<OpAsmParser::OperandType, 8> parsedOperands;
Type type;
OpAsmParser::OperandType memrefInfo;
SmallVector<OpAsmParser::OperandType, 8> indexInfo;
SmallVector<OpAsmParser::OperandType, 8> paddingInfo;
SmallVector<Type, 2> types;
// Parsing with support for optional paddingValue.
auto fail = parser->parseOperandList(parsedOperands) ||
parser->parseOptionalAttributeDict(result->attributes) ||
parser->parseColonType(type);
if (fail) {
if (parser->parseOperand(memrefInfo) ||
parser->parseOperandList(indexInfo, -1, OpAsmParser::Delimiter::Square) ||
parser->parseTrailingOperandList(paddingInfo, -1,
OpAsmParser::Delimiter::Paren) ||
parser->parseOptionalAttributeDict(result->attributes) ||
parser->parseColonTypeList(types))
return true;
}
// Resolution.
auto funType = type.dyn_cast<FunctionType>();
if (!funType)
return parser->emitError(parser->getNameLoc(), "Function type expected");
if (funType.getNumInputs() < 1)
return parser->emitError(parser->getNameLoc(),
"Function type expects at least one input");
MemRefType memrefType =
funType.getInput(Offsets::MemRefOffset).dyn_cast<MemRefType>();
if (types.size() != 2)
return parser->emitError(parser->getNameLoc(), "expected 2 types");
MemRefType memrefType = types[0].dyn_cast<MemRefType>();
if (!memrefType)
return parser->emitError(parser->getNameLoc(),
"MemRef type expected for first input");
if (funType.getNumResults() < 1)
return parser->emitError(parser->getNameLoc(),
"Function type expects exactly one vector result");
VectorType vectorType = funType.getResult(0).dyn_cast<VectorType>();
return parser->emitError(parser->getNameLoc(), "memRef type expected");
VectorType vectorType = types[1].dyn_cast<VectorType>();
if (!vectorType)
return parser->emitError(parser->getNameLoc(),
"Vector type expected for first result");
if (parsedOperands.size() != funType.getNumInputs())
return parser->emitError(parser->getNameLoc(),
"requires " + Twine(funType.getNumInputs()) +
" operands");
return parser->emitError(parser->getNameLoc(), "vector type expected");
// Extract optional paddingValue.
OpAsmParser::OperandType memrefInfo = parsedOperands[0];
// At this point, indexInfo may contain the optional paddingValue, pop it out.
SmallVector<OpAsmParser::OperandType, 8> indexInfo{
parsedOperands.begin() + Offsets::FirstIndexOffset, parsedOperands.end()};
if (indexInfo.size() != memrefType.getRank())
return parser->emitError(parser->getNameLoc(),
"expected " + Twine(memrefType.getRank()) +
" indices to the memref");
if (paddingInfo.size() > 1)
return parser->emitError(parser->getNameLoc(),
"expected at most one padding value");
Type paddingType;
OpAsmParser::OperandType paddingValue;
bool hasPaddingValue = indexInfo.size() > memrefType.getRank();
unsigned expectedNumOperands = Offsets::FirstIndexOffset +
memrefType.getRank() +
(hasPaddingValue ? 1 : 0);
if (hasPaddingValue) {
paddingType = funType.getInputs().back();
paddingValue = indexInfo.pop_back_val();
bool hasOptionalPaddingValue = !paddingInfo.empty();
if (hasOptionalPaddingValue) {
paddingType = vectorType.getElementType();
}
if (funType.getNumInputs() != expectedNumOperands)
return parser->emitError(
parser->getNameLoc(),
"requires actual number of operands to match function type");
auto indexType = parser->getBuilder().getIndexType();
return parser->resolveOperand(memrefInfo, memrefType, result->operands) ||
parser->resolveOperands(indexInfo, indexType, result->operands) ||
(hasPaddingValue && parser->resolveOperand(paddingValue, paddingType,
result->operands)) ||
(hasOptionalPaddingValue &&
parser->resolveOperand(paddingInfo[0], paddingType,
result->operands)) ||
parser->addTypeToList(vectorType, result->types);
}
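For reference, a sketch of the two textual forms this parser accepts, without and with the optional padding value (SSA names and types are assumed, mirroring the op documentation examples):

```mlir
%v0 = vector.transfer_read %src[%i, %j]
        {permutation_map: (d0, d1) -> (d0, d1)} : memref<?x?xf32>, vector<8x16xf32>
// %pad is an f32 value used to pad out-of-bounds accesses.
%v1 = vector.transfer_read %src[%i, %j], (%pad)
        {permutation_map: (d0, d1) -> (d0, d1)} : memref<?x?xf32>, vector<8x16xf32>
```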
@ -204,7 +181,7 @@ bool VectorTransferReadOp::verify() {
// Consistency of vector type in function type.
if (!getResult()->getType().isa<VectorType>()) {
return emitOpError("should have a vector result type in function type: "
"(memref_type [, elemental_type]) -> vector_type");
"memref_type<...xelemental_type>, vector_type");
}
// Consistency of elemental types in memref and vector.
MemRefType memrefType = getMemRefType();
@ -219,8 +196,9 @@ bool VectorTransferReadOp::verify() {
(optionalPaddingValue ? 1 : 0);
// Checks on the actual operands and their types.
if (getNumOperands() != expectedNumOperands) {
return emitOpError("expects " + Twine(expectedNumOperands) +
" operands to match the types");
return emitOpError("expects " + Twine(expectedNumOperands) + " operands " +
"(of which " + Twine(memrefType.getRank()) +
" indices)");
}
// Consistency of padding value with vector type.
if (optionalPaddingValue) {
@ -239,7 +217,7 @@ bool VectorTransferReadOp::verify() {
for (auto *idx : getIndices()) {
if (!idx->getType().isIndex()) {
return emitOpError(
"index to vector_transfer_read must have 'index' type");
"index to vector.transfer_read must have 'index' type");
}
++numIndices;
}
@ -301,63 +279,41 @@ void VectorTransferWriteOp::print(OpAsmPrinter *p) {
*p << getOperationName();
*p << " " << *getVector();
*p << ", " << *getMemRef();
*p << ", ";
*p << "[";
p->printOperands(getIndices());
*p << "]";
p->printOptionalAttrDict(getAttrs());
Type indexType = (*getIndices().begin())->getType();
*p << " : ";
p->printType(getVectorType());
*p << ", ";
p->printType(getMemRefType());
for (unsigned r = 0, n = getMemRefType().getRank(); r < n; ++r) {
*p << ", ";
p->printType(indexType);
}
}
bool VectorTransferWriteOp::parse(OpAsmParser *parser, OperationState *result) {
SmallVector<OpAsmParser::OperandType, 8> parsedOperands;
SmallVector<Type, 8> types;
// Parsing with support for optional paddingValue.
auto fail = parser->parseOperandList(parsedOperands) ||
parser->parseOptionalAttributeDict(result->attributes) ||
parser->parseColonTypeList(types);
if (fail) {
OpAsmParser::OperandType storeValueInfo;
OpAsmParser::OperandType memrefInfo;
SmallVector<OpAsmParser::OperandType, 4> indexInfo;
SmallVector<Type, 2> types;
auto indexType = parser->getBuilder().getIndexType();
if (parser->parseOperand(storeValueInfo) || parser->parseComma() ||
parser->parseOperand(memrefInfo) ||
parser->parseOperandList(indexInfo, -1, OpAsmParser::Delimiter::Square) ||
parser->parseOptionalAttributeDict(result->attributes) ||
parser->parseColonTypeList(types))
return true;
}
// Resolution.
if (parsedOperands.size() != types.size())
return parser->emitError(
parser->getNameLoc(),
"requires number of operands and input types to match");
if (parsedOperands.size() < Offsets::FirstIndexOffset)
return parser->emitError(parser->getNameLoc(),
"requires at least vector and memref operands");
if (types.size() != 2)
return parser->emitError(parser->getNameLoc(), "expected 2 types");
VectorType vectorType = types[Offsets::VectorOffset].dyn_cast<VectorType>();
if (!vectorType)
return parser->emitError(parser->getNameLoc(),
"Vector type expected for first input type");
return parser->emitError(parser->getNameLoc(), "vector type expected");
MemRefType memrefType = types[Offsets::MemRefOffset].dyn_cast<MemRefType>();
if (!memrefType)
return parser->emitError(parser->getNameLoc(),
"MemRef type expected for second input type");
return parser->emitError(parser->getNameLoc(), "memRef type expected");
unsigned expectedNumOperands =
Offsets::FirstIndexOffset + memrefType.getRank();
if (parsedOperands.size() != expectedNumOperands)
return parser->emitError(parser->getNameLoc(),
"requires " + Twine(expectedNumOperands) +
" operands");
OpAsmParser::OperandType vectorInfo = parsedOperands[Offsets::VectorOffset];
OpAsmParser::OperandType memrefInfo = parsedOperands[Offsets::MemRefOffset];
SmallVector<OpAsmParser::OperandType, 8> indexInfo{
parsedOperands.begin() + Offsets::FirstIndexOffset, parsedOperands.end()};
auto indexType = parser->getBuilder().getIndexType();
return parser->resolveOperand(vectorInfo, vectorType, result->operands) ||
parser->resolveOperand(memrefInfo, memrefType, result->operands) ||
return parser->resolveOperands(storeValueInfo, vectorType,
result->operands) ||
parser->resolveOperands(memrefInfo, memrefType, result->operands) ||
parser->resolveOperands(indexInfo, indexType, result->operands);
}
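
Likewise, a minimal sketch of the write form this parser is meant to round-trip (identifiers are illustrative, not taken from the tests):

```
vector.transfer_write %val, %A[%i0, %i1] {permutation_map: (d0, d1)->(d1)} : vector<128xf32>, memref<?x?xf32>
```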
@ -386,15 +342,16 @@ bool VectorTransferWriteOp::verify() {
Offsets::FirstIndexOffset + memrefType.getRank();
// Checks on the actual operands and their types.
if (getNumOperands() != expectedNumOperands) {
return emitOpError("expects " + Twine(expectedNumOperands) +
" operands to match the types");
return emitOpError("expects " + Twine(expectedNumOperands) + " operands " +
"(of which " + Twine(memrefType.getRank()) +
" indices)");
}
// Consistency of indices types.
unsigned numIndices = 0;
for (auto *idx : getIndices()) {
if (!idx->getType().isIndex()) {
return emitOpError(
"index to vector_transfer_write must have 'index' type");
"index to vector.transfer_write must have 'index' type");
}
numIndices++;
}

View File

@ -565,10 +565,10 @@ TEST_FUNC(vectorize_2d) {
// CHECK-NEXT: affine.for %i0 = 0 to (d0) -> (d0)(%[[M]]) {
// CHECK-NEXT: affine.for %i1 = 0 to (d0) -> (d0)(%[[N]]) step 4 {
// CHECK-NEXT: affine.for %i2 = 0 to (d0) -> (d0)(%[[P]]) step 4 {
// CHECK-NEXT: %[[vA:.*]] = "vector_transfer_read"(%arg1, %i0, %i1, %i2) {permutation_map: (d0, d1, d2) -> (d1, d2)} : (memref<?x?x?xf32>, index, index, index) -> vector<4x4xf32>
// CHECK-NEXT: %[[vB:.*]] = "vector_transfer_read"(%arg0, %i0, %i1, %i2) {permutation_map: (d0, d1, d2) -> (d1, d2)} : (memref<?x?x?xf32>, index, index, index) -> vector<4x4xf32>
// CHECK-NEXT: %[[vA:.*]] = "vector.transfer_read"(%arg1, %i0, %i1, %i2) {permutation_map: (d0, d1, d2) -> (d1, d2)} : (memref<?x?x?xf32>, index, index, index) -> vector<4x4xf32>
// CHECK-NEXT: %[[vB:.*]] = "vector.transfer_read"(%arg0, %i0, %i1, %i2) {permutation_map: (d0, d1, d2) -> (d1, d2)} : (memref<?x?x?xf32>, index, index, index) -> vector<4x4xf32>
// CHECK-NEXT: %[[vRES:.*]] = addf %[[vB]], %[[vA]] : vector<4x4xf32>
// CHECK-NEXT: "vector_transfer_write"(%[[vRES:.*]], %arg2, %i0, %i1, %i2) {permutation_map: (d0, d1, d2) -> (d1, d2)} : (vector<4x4xf32>, memref<?x?x?xf32>, index, index, index) -> ()
// CHECK-NEXT: "vector.transfer_write"(%[[vRES:.*]], %arg2, %i0, %i1, %i2) {permutation_map: (d0, d1, d2) -> (d1, d2)} : (vector<4x4xf32>, memref<?x?x?xf32>, index, index, index) -> ()
// clang-format on
mlir::PassManager pm;

View File

@ -330,22 +330,22 @@ func @test_dimop(%arg0: tensor<4x4x?xf32>) {
}
// CHECK-LABEL: func @test_vector_transfer_ops(%arg0
func @test_vector_transfer_ops(%arg0: memref<?x?xf32>) {
// CHECK-LABEL: func @test_vector.transfer_ops(%arg0
func @test_vector.transfer_ops(%arg0: memref<?x?xf32>) {
%c3 = constant 3 : index
%cst = constant 3.0 : f32
// CHECK: %0 = vector_transfer_read %arg0, %c3, %c3 {permutation_map: #[[map_proj_d0d1_d0]]} : (memref<?x?xf32>, index, index) -> vector<128xf32>
%0 = vector_transfer_read %arg0, %c3, %c3 {permutation_map: (d0, d1)->(d0)} : (memref<?x?xf32>, index, index) -> vector<128xf32>
// CHECK: %1 = vector_transfer_read %arg0, %c3, %c3 {permutation_map: #[[map_proj_d0d1_d1d0]]} : (memref<?x?xf32>, index, index) -> vector<3x7xf32>
%1 = vector_transfer_read %arg0, %c3, %c3 {permutation_map: (d0, d1)->(d1, d0)} : (memref<?x?xf32>, index, index) -> vector<3x7xf32>
// CHECK: %2 = vector_transfer_read %arg0, %c3, %c3, %cst {permutation_map: #[[map_proj_d0d1_d0]]} : (memref<?x?xf32>, index, index, f32) -> vector<128xf32>
%2 = vector_transfer_read %arg0, %c3, %c3, %cst {permutation_map: (d0, d1)->(d0)} : (memref<?x?xf32>, index, index, f32) -> vector<128xf32>
// CHECK: %3 = vector_transfer_read %arg0, %c3, %c3, %cst {permutation_map: #[[map_proj_d0d1_d1]]} : (memref<?x?xf32>, index, index, f32) -> vector<128xf32>
%3 = vector_transfer_read %arg0, %c3, %c3, %cst {permutation_map: (d0, d1)->(d1)} : (memref<?x?xf32>, index, index, f32) -> vector<128xf32>
// CHECK: %0 = vector.transfer_read %arg0[%c3, %c3] {permutation_map: #[[map_proj_d0d1_d0]]} : memref<?x?xf32>, vector<128xf32>
%0 = vector.transfer_read %arg0[%c3, %c3] {permutation_map: (d0, d1)->(d0)} : memref<?x?xf32>, vector<128xf32>
// CHECK: %1 = vector.transfer_read %arg0[%c3, %c3] {permutation_map: #[[map_proj_d0d1_d1d0]]} : memref<?x?xf32>, vector<3x7xf32>
%1 = vector.transfer_read %arg0[%c3, %c3] {permutation_map: (d0, d1)->(d1, d0)} : memref<?x?xf32>, vector<3x7xf32>
// CHECK: %2 = vector.transfer_read %arg0[%c3, %c3], (%cst) {permutation_map: #[[map_proj_d0d1_d0]]} : memref<?x?xf32>, vector<128xf32>
%2 = vector.transfer_read %arg0[%c3, %c3], (%cst) {permutation_map: (d0, d1)->(d0)} : memref<?x?xf32>, vector<128xf32>
// CHECK: %3 = vector.transfer_read %arg0[%c3, %c3], (%cst) {permutation_map: #[[map_proj_d0d1_d1]]} : memref<?x?xf32>, vector<128xf32>
%3 = vector.transfer_read %arg0[%c3, %c3], (%cst) {permutation_map: (d0, d1)->(d1)} : memref<?x?xf32>, vector<128xf32>
//
// CHECK: vector_transfer_write %0, %arg0, %c3, %c3 {permutation_map: #[[map_proj_d0d1_d0]]} : vector<128xf32>, memref<?x?xf32>, index, index
vector_transfer_write %0, %arg0, %c3, %c3 {permutation_map: (d0, d1)->(d0)} : vector<128xf32>, memref<?x?xf32>, index, index
// CHECK: vector_transfer_write %1, %arg0, %c3, %c3 {permutation_map: #[[map_proj_d0d1_d1d0]]} : vector<3x7xf32>, memref<?x?xf32>, index, index
vector_transfer_write %1, %arg0, %c3, %c3 {permutation_map: (d0, d1)->(d1, d0)} : vector<3x7xf32>, memref<?x?xf32>, index, index
// CHECK: vector.transfer_write %0, %arg0[%c3, %c3] {permutation_map: #[[map_proj_d0d1_d0]]} : vector<128xf32>, memref<?x?xf32>
vector.transfer_write %0, %arg0[%c3, %c3] {permutation_map: (d0, d1)->(d0)} : vector<128xf32>, memref<?x?xf32>
// CHECK: vector.transfer_write %1, %arg0[%c3, %c3] {permutation_map: #[[map_proj_d0d1_d1d0]]} : vector<3x7xf32>, memref<?x?xf32>
vector.transfer_write %1, %arg0[%c3, %c3] {permutation_map: (d0, d1)->(d1, d0)} : vector<3x7xf32>, memref<?x?xf32>
return
}

View File

@ -296,180 +296,181 @@ func @func_with_ops(tensor<?xi1>, tensor<42xi32>, tensor<42xi32>) {
// -----
func @test_vector_transfer_read(memref<?x?xf32>) {
func @test_vector.transfer_read(memref<?x?xf32>) {
^bb0(%arg0: memref<?x?xf32>):
%c3 = constant 3 : index
%cst = constant 3.0 : f32
// expected-error@+1 {{expected 4 operand types but had 3}}
%0 = "vector_transfer_read"(%arg0, %c3, %c3, %c3) : (memref<?x?xf32>, index, index) -> vector<128xf32>
// expected-error@+1 {{expected 2 types}}
%0 = vector.transfer_read %arg0[%c3, %c3] : memref<?x?xf32>
}
// -----
func @test_vector_transfer_read(memref<?x?xf32>) {
func @test_vector.transfer_read(memref<?x?xf32>) {
^bb0(%arg0: memref<?x?xf32>):
%c3 = constant 3 : index
%cst = constant 3.0 : f32
// expected-error@+1 {{requires 3 operands}}
%0 = vector_transfer_read %arg0, %c3, %c3, %c3 : (memref<?x?xf32>, index, index) -> vector<128xf32>
// expected-error@+1 {{expected 2 indices to the memref}}
%0 = vector.transfer_read %arg0[%c3, %c3, %c3] : memref<?x?xf32>, vector<128xf32>
}
// -----
func @test_vector_transfer_read(memref<?x?xf32>) {
func @test_vector.transfer_read(memref<?x?xf32>) {
^bb0(%arg0: memref<?x?xf32>):
%c3 = constant 3 : index
%cst = constant 3.0 : f32
// expected-error@+1 {{requires an AffineMapAttr named 'permutation_map'}}
%0 = vector_transfer_read %arg0, %c3, %c3 : (memref<?x?xf32>, index, index) -> vector<128xf32>
%0 = vector.transfer_read %arg0[%c3, %c3] : memref<?x?xf32>, vector<128xf32>
}
// -----
func @test_vector_transfer_read(memref<?x?xf32>) {
func @test_vector.transfer_read(memref<?x?xf32>) {
^bb0(%arg0: memref<?x?xf32>):
%c3 = constant 3 : index
%cst = constant 3.0 : f32
// expected-error@+1 {{requires an AffineMapAttr named 'permutation_map'}}
%0 = vector_transfer_read %arg0, %c3, %c3 {perm: (d0)->(d0)} : (memref<?x?xf32>, index, index) -> vector<128xf32>
%0 = vector.transfer_read %arg0[%c3, %c3] {perm: (d0)->(d0)} : memref<?x?xf32>, vector<128xf32>
}
// -----
func @test_vector_transfer_read(memref<?x?xf32>) {
func @test_vector.transfer_read(memref<?x?xf32>) {
^bb0(%arg0: memref<?x?xf32>):
%c3 = constant 3 : index
%cst = constant 3.0 : f32
// expected-error@+1 {{requires a permutation_map with input dims of the same rank as the memref type}}
%0 = vector_transfer_read %arg0, %c3, %c3 {permutation_map: (d0)->(d0)} : (memref<?x?xf32>, index, index) -> vector<128xf32>
%0 = vector.transfer_read %arg0[%c3, %c3] {permutation_map: (d0)->(d0)} : memref<?x?xf32>, vector<128xf32>
}
// -----
func @test_vector_transfer_read(memref<?x?xf32>) {
func @test_vector.transfer_read(memref<?x?xf32>) {
^bb0(%arg0: memref<?x?xf32>):
%c3 = constant 3 : index
%cst = constant 3.0 : f32
// expected-error@+1 {{requires a permutation_map with result dims of the same rank as the vector type}}
%0 = vector_transfer_read %arg0, %c3, %c3 {permutation_map: (d0, d1)->(d0, d1)} : (memref<?x?xf32>, index, index) -> vector<128xf32>
%0 = vector.transfer_read %arg0[%c3, %c3] {permutation_map: (d0, d1)->(d0, d1)} : memref<?x?xf32>, vector<128xf32>
}
// -----
func @test_vector_transfer_read(memref<?x?xf32>) {
func @test_vector.transfer_read(memref<?x?xf32>) {
^bb0(%arg0: memref<?x?xf32>):
%c3 = constant 3 : index
%cst = constant 3.0 : f32
// expected-error@+1 {{requires a projected permutation_map (at most one dim or the zero constant can appear in each result)}}
%0 = vector_transfer_read %arg0, %c3, %c3 {permutation_map: (d0, d1)->(d0 + d1)} : (memref<?x?xf32>, index, index) -> vector<128xf32>
%0 = vector.transfer_read %arg0[%c3, %c3] {permutation_map: (d0, d1)->(d0 + d1)} : memref<?x?xf32>, vector<128xf32>
}
// -----
func @test_vector_transfer_read(memref<?x?xf32>) {
func @test_vector.transfer_read(memref<?x?xf32>) {
^bb0(%arg0: memref<?x?xf32>):
%c3 = constant 3 : index
%cst = constant 3.0 : f32
// expected-error@+1 {{requires a projected permutation_map (at most one dim or the zero constant can appear in each result)}}
%0 = vector_transfer_read %arg0, %c3, %c3 {permutation_map: (d0, d1)->(d0 + 1)} : (memref<?x?xf32>, index, index) -> vector<128xf32>
%0 = vector.transfer_read %arg0[%c3, %c3] {permutation_map: (d0, d1)->(d0 + 1)} : memref<?x?xf32>, vector<128xf32>
}
// -----
func @test_vector_transfer_read(memref<?x?x?xf32>) {
func @test_vector.transfer_read(memref<?x?x?xf32>) {
^bb0(%arg0: memref<?x?x?xf32>):
%c3 = constant 3 : index
%cst = constant 3.0 : f32
// expected-error@+1 {{requires a permutation_map that is a permutation (found one dim used more than once)}}
%0 = vector_transfer_read %arg0, %c3, %c3, %c3 {permutation_map: (d0, d1, d2)->(d0, d0)} : (memref<?x?x?xf32>, index, index, index) -> vector<3x7xf32>
%0 = vector.transfer_read %arg0[%c3, %c3, %c3] {permutation_map: (d0, d1, d2)->(d0, d0)} : memref<?x?x?xf32>, vector<3x7xf32>
}
// -----
func @test_vector_transfer_write(memref<?x?xf32>) {
func @test_vector.transfer_write(memref<?x?xf32>) {
^bb0(%arg0: memref<?x?xf32>):
%c3 = constant 3 : index
%cst = constant splat<vector<128 x f32>, 3.0> : vector<128 x f32>
// expected-error@+1 {{expected 5 operand types but had 4}}
%0 = "vector_transfer_write"(%cst, %arg0, %c3, %c3, %c3) : (vector<128xf32>, memref<?x?xf32>, index, index) -> ()
%0 = "vector.transfer_write"(%cst, %arg0, %c3, %c3, %c3) : (vector<128xf32>, memref<?x?xf32>, index, index) -> ()
}
// -----
func @test_vector_transfer_write(memref<?x?xf32>) {
func @test_vector.transfer_write(memref<?x?xf32>) {
^bb0(%arg0: memref<?x?xf32>):
%c3 = constant 3 : index
%cst = constant splat<vector<128 x f32>, 3.0> : vector<128 x f32>
// expected-error@+1 {{requires number of operands and input types to match}}
vector_transfer_write %cst, %arg0, %c3, %c3, %c3 : vector<128xf32>, memref<?x?xf32>, index, index
// expected-error@+1 {{expects 4 operands (of which 2 indices)}}
vector.transfer_write %cst, %arg0[%c3, %c3, %c3] : vector<128xf32>, memref<?x?xf32>
}
// -----
func @test_vector_transfer_write(memref<?x?xf32>) {
func @test_vector.transfer_write(memref<?x?xf32>) {
^bb0(%arg0: memref<?x?xf32>):
%c3 = constant 3 : index
%cst = constant splat<vector<128 x f32>, 3.0> : vector<128 x f32>
// expected-error@+1 {{requires an AffineMapAttr named 'permutation_map'}}
vector_transfer_write %cst, %arg0, %c3, %c3 : vector<128xf32>, memref<?x?xf32>, index, index
vector.transfer_write %cst, %arg0[%c3, %c3] : vector<128xf32>, memref<?x?xf32>
}
// -----
func @test_vector_transfer_write(memref<?x?xf32>) {
func @test_vector.transfer_write(memref<?x?xf32>) {
^bb0(%arg0: memref<?x?xf32>):
%c3 = constant 3 : index
%cst = constant splat<vector<128 x f32>, 3.0> : vector<128 x f32>
// expected-error@+1 {{requires an AffineMapAttr named 'permutation_map'}}
vector_transfer_write %cst, %arg0, %c3, %c3 {perm: (d0)->(d0)} : vector<128xf32>, memref<?x?xf32>, index, index
vector.transfer_write %cst, %arg0[%c3, %c3] {perm: (d0)->(d0)} : vector<128xf32>, memref<?x?xf32>
}
// -----
func @test_vector_transfer_write(memref<?x?xf32>) {
func @test_vector.transfer_write(memref<?x?xf32>) {
^bb0(%arg0: memref<?x?xf32>):
%c3 = constant 3 : index
%cst = constant splat<vector<128 x f32>, 3.0> : vector<128 x f32>
// expected-error@+1 {{requires a permutation_map with input dims of the same rank as the memref type}}
vector_transfer_write %cst, %arg0, %c3, %c3 {permutation_map: (d0)->(d0)} : vector<128xf32>, memref<?x?xf32>, index, index
vector.transfer_write %cst, %arg0[%c3, %c3] {permutation_map: (d0)->(d0)} : vector<128xf32>, memref<?x?xf32>
}
// -----
func @test_vector_transfer_write(memref<?x?xf32>) {
func @test_vector.transfer_write(memref<?x?xf32>) {
^bb0(%arg0: memref<?x?xf32>):
%c3 = constant 3 : index
%cst = constant splat<vector<128 x f32>, 3.0> : vector<128 x f32>
// expected-error@+1 {{requires a permutation_map with result dims of the same rank as the vector type}}
vector_transfer_write %cst, %arg0, %c3, %c3 {permutation_map: (d0, d1)->(d0, d1)} : vector<128xf32>, memref<?x?xf32>, index, index
vector.transfer_write %cst, %arg0[%c3, %c3] {permutation_map: (d0, d1)->(d0, d1)} : vector<128xf32>, memref<?x?xf32>
}
// -----
func @test_vector_transfer_write(memref<?x?xf32>) {
func @test_vector.transfer_write(memref<?x?xf32>) {
^bb0(%arg0: memref<?x?xf32>):
%c3 = constant 3 : index
%cst = constant splat<vector<128 x f32>, 3.0> : vector<128 x f32>
// expected-error@+1 {{requires a projected permutation_map (at most one dim or the zero constant can appear in each result)}}
vector_transfer_write %cst, %arg0, %c3, %c3 {permutation_map: (d0, d1)->(d0 + d1)} : vector<128xf32>, memref<?x?xf32>, index, index
vector.transfer_write %cst, %arg0[%c3, %c3] {permutation_map: (d0, d1)->(d0 + d1)} : vector<128xf32>, memref<?x?xf32>
}
// -----
func @test_vector_transfer_write(memref<?x?xf32>) {
func @test_vector.transfer_write(memref<?x?xf32>) {
^bb0(%arg0: memref<?x?xf32>):
%c3 = constant 3 : index
%cst = constant splat<vector<128 x f32>, 3.0> : vector<128 x f32>
// expected-error@+1 {{requires a projected permutation_map (at most one dim or the zero constant can appear in each result)}}
vector_transfer_write %cst, %arg0, %c3, %c3 {permutation_map: (d0, d1)->(d0 + 1)} : vector<128xf32>, memref<?x?xf32>, index, index
vector.transfer_write %cst, %arg0[%c3, %c3] {permutation_map: (d0, d1)->(d0 + 1)} : vector<128xf32>, memref<?x?xf32>
}
// -----
func @test_vector_transfer_write(memref<?x?x?xf32>) {
func @test_vector.transfer_write(memref<?x?x?xf32>) {
^bb0(%arg0: memref<?x?x?xf32>):
%c3 = constant 3 : index
%cst = constant splat<vector<3 x 7 x f32>, 3.0> : vector<3 x 7 x f32>
// expected-error@+1 {{requires a permutation_map that is a permutation (found one dim used more than once)}}
vector_transfer_write %cst, %arg0, %c3, %c3, %c3 {permutation_map: (d0, d1, d2)->(d0, d0)} : vector<3x7xf32>, memref<?x?x?xf32>, index, index, index
vector.transfer_write %cst, %arg0[%c3, %c3, %c3] {permutation_map: (d0, d1, d2)->(d0, d0)} : vector<3x7xf32>, memref<?x?x?xf32>
}
// -----

View File

@ -8,13 +8,13 @@ func @materialize_read_1d() {
%A = alloc () : memref<7x42xf32>
affine.for %i0 = 0 to 7 step 4 {
affine.for %i1 = 0 to 42 step 4 {
%f1 = vector_transfer_read %A, %i0, %i1 {permutation_map: (d0, d1) -> (d0)} : (memref<7x42xf32>, index, index) -> vector<4xf32>
%f1 = vector.transfer_read %A[%i0, %i1] {permutation_map: (d0, d1) -> (d0)} : memref<7x42xf32>, vector<4xf32>
%ip1 = affine.apply (d0) -> (d0 + 1) (%i1)
%f2 = vector_transfer_read %A, %i0, %ip1 {permutation_map: (d0, d1) -> (d0)} : (memref<7x42xf32>, index, index) -> vector<4xf32>
%f2 = vector.transfer_read %A[%i0, %ip1] {permutation_map: (d0, d1) -> (d0)} : memref<7x42xf32>, vector<4xf32>
%ip2 = affine.apply (d0) -> (d0 + 2) (%i1)
%f3 = vector_transfer_read %A, %i0, %ip2 {permutation_map: (d0, d1) -> (d0)} : (memref<7x42xf32>, index, index) -> vector<4xf32>
%f3 = vector.transfer_read %A[%i0, %ip2] {permutation_map: (d0, d1) -> (d0)} : memref<7x42xf32>, vector<4xf32>
%ip3 = affine.apply (d0) -> (d0 + 3) (%i1)
%f4 = vector_transfer_read %A, %i0, %ip3 {permutation_map: (d0, d1) -> (d0)} : (memref<7x42xf32>, index, index) -> vector<4xf32>
%f4 = vector.transfer_read %A[%i0, %ip3] {permutation_map: (d0, d1) -> (d0)} : memref<7x42xf32>, vector<4xf32>
// Both accesses in the load must be clipped, otherwise %i1 + 2 and %i1 + 3 will go out of bounds.
// CHECK: {{.*}} = select
// CHECK: %[[FILTERED1:.*]] = select
@ -34,9 +34,9 @@ func @materialize_read_1d_partially_specialized(%dyn1 : index, %dyn2 : index, %d
affine.for %i2 = 0 to %dyn2 {
affine.for %i3 = 0 to 42 step 2 {
affine.for %i4 = 0 to %dyn4 {
%f1 = vector_transfer_read %A, %i0, %i1, %i2, %i3, %i4 {permutation_map: (d0, d1, d2, d3, d4) -> (d3)} : ( memref<7x?x?x42x?xf32>, index, index, index, index, index) -> vector<4xf32>
%f1 = vector.transfer_read %A[%i0, %i1, %i2, %i3, %i4] {permutation_map: (d0, d1, d2, d3, d4) -> (d3)} : memref<7x?x?x42x?xf32>, vector<4xf32>
%i3p1 = affine.apply (d0) -> (d0 + 1) (%i3)
%f2 = vector_transfer_read %A, %i0, %i1, %i2, %i3p1, %i4 {permutation_map: (d0, d1, d2, d3, d4) -> (d3)} : ( memref<7x?x?x42x?xf32>, index, index, index, index, index) -> vector<4xf32>
%f2 = vector.transfer_read %A[%i0, %i1, %i2, %i3p1, %i4] {permutation_map: (d0, d1, d2, d3, d4) -> (d3)} : memref<7x?x?x42x?xf32>, vector<4xf32>
}
}
}
@ -63,7 +63,7 @@ func @materialize_read(%M: index, %N: index, %O: index, %P: index) {
// CHECK-NEXT: %[[D2:.*]] = dim %0, 2 : memref<?x?x?x?xf32>
// CHECK-NEXT: %[[D3:.*]] = dim %0, 3 : memref<?x?x?x?xf32>
// CHECK: %[[ALLOC:.*]] = alloc() : memref<5x4x3xf32>
// CHECK-NEXT: %[[VECTOR_VIEW:.*]] = vector_type_cast %[[ALLOC]] : memref<5x4x3xf32>, memref<1xvector<5x4x3xf32>>
// CHECK-NEXT: %[[VECTOR_VIEW:.*]] = vector.type_cast %[[ALLOC]] : memref<5x4x3xf32>, memref<1xvector<5x4x3xf32>>
// CHECK-NEXT: affine.for %[[I4:.*]] = 0 to 3 {
// CHECK-NEXT: affine.for %[[I5:.*]] = 0 to 4 {
// CHECK-NEXT: affine.for %[[I6:.*]] = 0 to 5 {
@ -121,7 +121,7 @@ func @materialize_read(%M: index, %N: index, %O: index, %P: index) {
affine.for %i1 = 0 to %N {
affine.for %i2 = 0 to %O {
affine.for %i3 = 0 to %P step 5 {
%f = vector_transfer_read %A, %i0, %i1, %i2, %i3 {permutation_map: (d0, d1, d2, d3) -> (d3, 0, d0)} : (memref<?x?x?x?xf32, 0>, index, index, index, index) -> vector<5x4x3xf32>
%f = vector.transfer_read %A[%i0, %i1, %i2, %i3] {permutation_map: (d0, d1, d2, d3) -> (d3, 0, d0)} : memref<?x?x?x?xf32>, vector<5x4x3xf32>
}
}
}
@ -142,7 +142,7 @@ func @materialize_write(%M: index, %N: index, %O: index, %P: index) {
// CHECK-NEXT: %[[D2:.*]] = dim %0, 2 : memref<?x?x?x?xf32>
// CHECK-NEXT: %[[D3:.*]] = dim %0, 3 : memref<?x?x?x?xf32>
// CHECK: %[[ALLOC:.*]] = alloc() : memref<5x4x3xf32>
// CHECK-NEXT: %[[VECTOR_VIEW:.*]] = vector_type_cast {{.*}} : memref<5x4x3xf32>, memref<1xvector<5x4x3xf32>>
// CHECK-NEXT: %[[VECTOR_VIEW:.*]] = vector.type_cast {{.*}} : memref<5x4x3xf32>, memref<1xvector<5x4x3xf32>>
// CHECK: store %cst, {{.*}} : memref<1xvector<5x4x3xf32>>
// CHECK-NEXT: affine.for %[[I4:.*]] = 0 to 3 {
// CHECK-NEXT: affine.for %[[I5:.*]] = 0 to 4 {
@ -205,7 +205,7 @@ func @materialize_write(%M: index, %N: index, %O: index, %P: index) {
affine.for %i1 = 0 to %N step 4 {
affine.for %i2 = 0 to %O {
affine.for %i3 = 0 to %P step 5 {
vector_transfer_write %f1, %A, %i0, %i1, %i2, %i3 {permutation_map: (d0, d1, d2, d3) -> (d3, d1, d0)} : vector<5x4x3xf32>, memref<?x?x?x?xf32, 0>, index, index, index, index
vector.transfer_write %f1, %A[%i0, %i1, %i2, %i3] {permutation_map: (d0, d1, d2, d3) -> (d3, d1, d0)} : vector<5x4x3xf32>, memref<?x?x?x?xf32>
}
}
}

View File

@ -18,18 +18,18 @@ func @materialize(%M : index, %N : index, %O : index, %P : index) {
// CHECK-NEXT: %[[b:[0-9]+]] = {{.*}}[[ID1]](%i1)
// CHECK-NEXT: %[[c:[0-9]+]] = {{.*}}[[ID1]](%i2)
// CHECK-NEXT: %[[d:[0-9]+]] = {{.*}}[[ID1]](%i3)
// CHECK-NEXT: vector_transfer_write {{.*}}, %0, %[[a]], %[[b]], %[[c]], %[[d]] {permutation_map: #[[D0D1D2D3TOD1D0]]} : vector<4x4xf32>, memref<?x?x?x?xf32>, index, index, index, index
// CHECK-NEXT: vector.transfer_write {{.*}}, %0[%[[a]], %[[b]], %[[c]], %[[d]]] {permutation_map: #[[D0D1D2D3TOD1D0]]} : vector<4x4xf32>, memref<?x?x?x?xf32>
// CHECK: %[[b1:[0-9]+]] = {{.*}}[[D0P1]](%i1)
// CHECK: vector_transfer_write {{.*}}, %0, {{.*}}, %[[b1]], {{.*}} {permutation_map: #[[D0D1D2D3TOD1D0]]} : vector<4x4xf32>, memref<?x?x?x?xf32>, index, index, index, index
// CHECK: vector.transfer_write {{.*}}, %0[{{.*}}, %[[b1]], {{.*}}] {permutation_map: #[[D0D1D2D3TOD1D0]]} : vector<4x4xf32>, memref<?x?x?x?xf32>
// CHECK: %[[b2:[0-9]+]] = {{.*}}[[D0P2]](%i1)
// CHECK: vector_transfer_write {{.*}}, %0, {{.*}}, %[[b2]], {{.*}} {permutation_map: #[[D0D1D2D3TOD1D0]]} : vector<4x4xf32>, memref<?x?x?x?xf32>, index, index, index, index
// CHECK: vector.transfer_write {{.*}}, %0[{{.*}}, %[[b2]], {{.*}}] {permutation_map: #[[D0D1D2D3TOD1D0]]} : vector<4x4xf32>, memref<?x?x?x?xf32>
// CHECK: %[[b3:[0-9]+]] = {{.*}}[[D0P3]](%i1)
// CHECK: vector_transfer_write {{.*}}, %0, {{.*}}, %[[b3]], {{.*}} {permutation_map: #[[D0D1D2D3TOD1D0]]} : vector<4x4xf32>, memref<?x?x?x?xf32>, index, index, index, index
// CHECK: vector.transfer_write {{.*}}, %0[{{.*}}, %[[b3]], {{.*}}] {permutation_map: #[[D0D1D2D3TOD1D0]]} : vector<4x4xf32>, memref<?x?x?x?xf32>
affine.for %i0 = 0 to %M step 4 {
affine.for %i1 = 0 to %N step 4 {
affine.for %i2 = 0 to %O {
affine.for %i3 = 0 to %P step 4 {
"vector_transfer_write"(%f1, %A, %i0, %i1, %i2, %i3) {permutation_map: (d0, d1, d2, d3) -> (d3, d1, d0)} : (vector<4x4x4xf32>, memref<?x?x?x?xf32, 0>, index, index, index, index) -> ()
vector.transfer_write %f1, %A[%i0, %i1, %i2, %i3] {permutation_map: (d0, d1, d2, d3) -> (d3, d1, d0)} : vector<4x4x4xf32>, memref<?x?x?x?xf32, 0>
}
}
}

View File

@ -17,22 +17,22 @@ func @vector_add_2d(%M : index, %N : index) -> f32 {
// 4x unroll (jammed by construction).
// CHECK: affine.for %i0 = 0 to %arg0 {
// CHECK-NEXT: affine.for %i1 = 0 to %arg1 step 32 {
// CHECK-NEXT: [[CST0:%.*]] = constant splat<vector<8xf32>, 1.000000e+00> : vector<8xf32>
// CHECK-NEXT: [[CST1:%.*]] = constant splat<vector<8xf32>, 1.000000e+00> : vector<8xf32>
// CHECK-NEXT: [[CST2:%.*]] = constant splat<vector<8xf32>, 1.000000e+00> : vector<8xf32>
// CHECK-NEXT: [[CST3:%.*]] = constant splat<vector<8xf32>, 1.000000e+00> : vector<8xf32>
// CHECK-NEXT: [[VAL00:%.*]] = affine.apply [[ID1]]{{.*}}
// CHECK-NEXT: [[VAL01:%.*]] = affine.apply [[ID1]]{{.*}}
// CHECK-NEXT: vector_transfer_write [[CST0]], {{.*}}, [[VAL00]], [[VAL01]] {permutation_map: [[D0D1TOD1]]} : vector<8xf32>
// CHECK-NEXT: [[VAL10:%.*]] = affine.apply [[ID1]]{{.*}}
// CHECK-NEXT: [[VAL11:%.*]] = affine.apply [[D0P8]]{{.*}}
// CHECK-NEXT: vector_transfer_write [[CST1]], {{.*}}, [[VAL10]], [[VAL11]] {permutation_map: [[D0D1TOD1]]} : vector<8xf32>
// CHECK-NEXT: [[VAL20:%.*]] = affine.apply [[ID1]]{{.*}}
// CHECK-NEXT: [[VAL21:%.*]] = affine.apply [[D0P16]]{{.*}}
// CHECK-NEXT: vector_transfer_write [[CST2]], {{.*}}, [[VAL20]], [[VAL21]] {permutation_map: [[D0D1TOD1]]} : vector<8xf32>
// CHECK-NEXT: [[VAL30:%.*]] = affine.apply [[ID1]]{{.*}}
// CHECK-NEXT: [[VAL31:%.*]] = affine.apply [[D0P24]]{{.*}}
// CHECK-NEXT: vector_transfer_write [[CST3]], {{.*}}, [[VAL30]], [[VAL31]] {permutation_map: [[D0D1TOD1]]} : vector<8xf32>
// CHECK-NEXT: %[[CST0:.*]] = constant splat<vector<8xf32>, 1.000000e+00> : vector<8xf32>
// CHECK-NEXT: %[[CST1:.*]] = constant splat<vector<8xf32>, 1.000000e+00> : vector<8xf32>
// CHECK-NEXT: %[[CST2:.*]] = constant splat<vector<8xf32>, 1.000000e+00> : vector<8xf32>
// CHECK-NEXT: %[[CST3:.*]] = constant splat<vector<8xf32>, 1.000000e+00> : vector<8xf32>
// CHECK-NEXT: %[[VAL00:.*]] = affine.apply [[ID1]]{{.*}}
// CHECK-NEXT: %[[VAL01:.*]] = affine.apply [[ID1]]{{.*}}
// CHECK-NEXT: vector.transfer_write %[[CST0]], {{.*}}[%[[VAL00]], %[[VAL01]]] {permutation_map: [[D0D1TOD1]]} : vector<8xf32>, memref<?x?xf32>
// CHECK-NEXT: %[[VAL10:.*]] = affine.apply [[ID1]]{{.*}}
// CHECK-NEXT: %[[VAL11:.*]] = affine.apply [[D0P8]]{{.*}}
// CHECK-NEXT: vector.transfer_write %[[CST1]], {{.*}}[%[[VAL10]], %[[VAL11]]] {permutation_map: [[D0D1TOD1]]} : vector<8xf32>, memref<?x?xf32>
// CHECK-NEXT: %[[VAL20:.*]] = affine.apply [[ID1]]{{.*}}
// CHECK-NEXT: %[[VAL21:.*]] = affine.apply [[D0P16]]{{.*}}
// CHECK-NEXT: vector.transfer_write %[[CST2]], {{.*}}[%[[VAL20]], %[[VAL21]]] {permutation_map: [[D0D1TOD1]]} : vector<8xf32>, memref<?x?xf32>
// CHECK-NEXT: %[[VAL30:.*]] = affine.apply [[ID1]]{{.*}}
// CHECK-NEXT: %[[VAL31:.*]] = affine.apply [[D0P24]]{{.*}}
// CHECK-NEXT: vector.transfer_write %[[CST3]], {{.*}}[%[[VAL30]], %[[VAL31]]] {permutation_map: [[D0D1TOD1]]} : vector<8xf32>, memref<?x?xf32>
//
affine.for %i0 = 0 to %M {
affine.for %i1 = 0 to %N {
@ -43,22 +43,22 @@ func @vector_add_2d(%M : index, %N : index) -> f32 {
// 4x unroll (jammed by construction).
// CHECK: affine.for %i2 = 0 to %arg0 {
// CHECK-NEXT: affine.for %i3 = 0 to %arg1 step 32 {
// CHECK-NEXT: [[CST0:%.*]] = constant splat<vector<8xf32>, 2.000000e+00> : vector<8xf32>
// CHECK-NEXT: [[CST1:%.*]] = constant splat<vector<8xf32>, 2.000000e+00> : vector<8xf32>
// CHECK-NEXT: [[CST2:%.*]] = constant splat<vector<8xf32>, 2.000000e+00> : vector<8xf32>
// CHECK-NEXT: [[CST3:%.*]] = constant splat<vector<8xf32>, 2.000000e+00> : vector<8xf32>
// CHECK-NEXT: [[VAL00:%.*]] = affine.apply [[ID1]]{{.*}}
// CHECK-NEXT: [[VAL01:%.*]] = affine.apply [[ID1]]{{.*}}
// CHECK-NEXT: vector_transfer_write [[CST0]], {{.*}}, [[VAL00]], [[VAL01]] {permutation_map: [[D0D1TOD1]]} : vector<8xf32>
// CHECK-NEXT: [[VAL10:%.*]] = affine.apply [[ID1]]{{.*}}
// CHECK-NEXT: [[VAL11:%.*]] = affine.apply [[D0P8]]{{.*}}
// CHECK-NEXT: vector_transfer_write [[CST1]], {{.*}}, [[VAL10]], [[VAL11]] {permutation_map: [[D0D1TOD1]]} : vector<8xf32>
// CHECK-NEXT: [[VAL20:%.*]] = affine.apply [[ID1]]{{.*}}
// CHECK-NEXT: [[VAL21:%.*]] = affine.apply [[D0P16]]{{.*}}
// CHECK-NEXT: vector_transfer_write [[CST2]], {{.*}}, [[VAL20]], [[VAL21]] {permutation_map: [[D0D1TOD1]]} : vector<8xf32>
// CHECK-NEXT: [[VAL30:%.*]] = affine.apply [[ID1]]{{.*}}
// CHECK-NEXT: [[VAL31:%.*]] = affine.apply [[D0P24]]{{.*}}
// CHECK-NEXT: vector_transfer_write [[CST3]], {{.*}}, [[VAL30]], [[VAL31]] {permutation_map: [[D0D1TOD1]]} : vector<8xf32>
// CHECK-NEXT: %[[CST0:.*]] = constant splat<vector<8xf32>, 2.000000e+00> : vector<8xf32>
// CHECK-NEXT: %[[CST1:.*]] = constant splat<vector<8xf32>, 2.000000e+00> : vector<8xf32>
// CHECK-NEXT: %[[CST2:.*]] = constant splat<vector<8xf32>, 2.000000e+00> : vector<8xf32>
// CHECK-NEXT: %[[CST3:.*]] = constant splat<vector<8xf32>, 2.000000e+00> : vector<8xf32>
// CHECK-NEXT: %[[VAL00:.*]] = affine.apply [[ID1]]{{.*}}
// CHECK-NEXT: %[[VAL01:.*]] = affine.apply [[ID1]]{{.*}}
// CHECK-NEXT: vector.transfer_write %[[CST0]], {{.*}}[%[[VAL00]], %[[VAL01]]] {permutation_map: [[D0D1TOD1]]} : vector<8xf32>, memref<?x?xf32>
// CHECK-NEXT: %[[VAL10:.*]] = affine.apply [[ID1]]{{.*}}
// CHECK-NEXT: %[[VAL11:.*]] = affine.apply [[D0P8]]{{.*}}
// CHECK-NEXT: vector.transfer_write %[[CST1]], {{.*}}[%[[VAL10]], %[[VAL11]]] {permutation_map: [[D0D1TOD1]]} : vector<8xf32>, memref<?x?xf32>
// CHECK-NEXT: %[[VAL20:.*]] = affine.apply [[ID1]]{{.*}}
// CHECK-NEXT: %[[VAL21:.*]] = affine.apply [[D0P16]]{{.*}}
// CHECK-NEXT: vector.transfer_write %[[CST2]], {{.*}}[%[[VAL20]], %[[VAL21]]] {permutation_map: [[D0D1TOD1]]} : vector<8xf32>, memref<?x?xf32>
// CHECK-NEXT: %[[VAL30:.*]] = affine.apply [[ID1]]{{.*}}
// CHECK-NEXT: %[[VAL31:.*]] = affine.apply [[D0P24]]{{.*}}
// CHECK-NEXT: vector.transfer_write %[[CST3]], {{.*}}[%[[VAL30]], %[[VAL31]]] {permutation_map: [[D0D1TOD1]]} : vector<8xf32>, memref<?x?xf32>
//
affine.for %i2 = 0 to %M {
affine.for %i3 = 0 to %N {
@ -71,44 +71,44 @@ func @vector_add_2d(%M : index, %N : index) -> f32 {
// CHECK-NEXT: affine.for %i5 = 0 to %arg1 step 32 {
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = vector_transfer_read
// CHECK-NEXT: {{.*}} = vector.transfer_read
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = vector_transfer_read
// CHECK-NEXT: {{.*}} = vector.transfer_read
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = vector_transfer_read
// CHECK-NEXT: {{.*}} = vector.transfer_read
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = vector_transfer_read
// CHECK-NEXT: {{.*}} = vector.transfer_read
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = vector_transfer_read
// CHECK-NEXT: {{.*}} = vector.transfer_read
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = vector_transfer_read
// CHECK-NEXT: {{.*}} = vector.transfer_read
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = vector_transfer_read
// CHECK-NEXT: {{.*}} = vector.transfer_read
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = vector_transfer_read
// CHECK-NEXT: {{.*}} = vector.transfer_read
// CHECK-NEXT: {{.*}} = addf {{.*}} : vector<8xf32>
// CHECK-NEXT: {{.*}} = addf {{.*}} : vector<8xf32>
// CHECK-NEXT: {{.*}} = addf {{.*}} : vector<8xf32>
// CHECK-NEXT: {{.*}} = addf {{.*}} : vector<8xf32>
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: vector_transfer_write
// CHECK-NEXT: vector.transfer_write
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: vector_transfer_write
// CHECK-NEXT: vector.transfer_write
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: vector_transfer_write
// CHECK-NEXT: vector.transfer_write
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: vector_transfer_write
// CHECK-NEXT: vector.transfer_write
//
affine.for %i4 = 0 to %M {
affine.for %i5 = 0 to %N {

View File

@ -23,24 +23,24 @@ func @vector_add_2d(%M : index, %N : index) -> f32 {
// CHECK-NEXT: {{.*}} = constant splat<vector<8xf32>, 1.000000e+00> : vector<8xf32>
// CHECK-NEXT: {{.*}} = constant splat<vector<8xf32>, 1.000000e+00> : vector<8xf32>
// CHECK-NEXT: {{.*}} = constant splat<vector<8xf32>, 1.000000e+00> : vector<8xf32>
// CHECK-NEXT: [[VAL00:%.*]] = affine.apply [[ID1]](%i0)
// CHECK-NEXT: [[VAL01:%.*]] = affine.apply [[ID1]](%i1)
// CHECK-NEXT: vector_transfer_write {{.*}}, {{.*}}, [[VAL00]], [[VAL01]] {permutation_map: [[D0D1TOD1]]} : vector<8xf32>
// CHECK-NEXT: [[VAL10:%.*]] = affine.apply [[ID1]](%i0)
// CHECK-NEXT: [[VAL11:%.*]] = affine.apply [[D0P8]](%i1)
// CHECK-NEXT: vector_transfer_write {{.*}}, {{.*}}, [[VAL10]], [[VAL11]] {permutation_map: [[D0D1TOD1]]} : vector<8xf32>
// CHECK-NEXT: [[VAL20:%.*]] = affine.apply [[D0P1]](%i0)
// CHECK-NEXT: [[VAL21:%.*]] = affine.apply [[ID1]](%i1)
// CHECK-NEXT: vector_transfer_write {{.*}}, {{.*}}, [[VAL20]], [[VAL21]] {permutation_map: [[D0D1TOD1]]} : vector<8xf32>
// CHECK-NEXT: [[VAL30:%.*]] = affine.apply [[D0P1]](%i0)
// CHECK-NEXT: [[VAL31:%.*]] = affine.apply [[D0P8]](%i1)
// CHECK-NEXT: vector_transfer_write {{.*}}, {{.*}}, [[VAL30]], [[VAL31]] {permutation_map: [[D0D1TOD1]]} : vector<8xf32>
// CHECK-NEXT: [[VAL40:%.*]] = affine.apply [[D0P2]](%i0)
// CHECK-NEXT: [[VAL41:%.*]] = affine.apply [[ID1]](%i1)
// CHECK-NEXT: vector_transfer_write {{.*}}, {{.*}}, [[VAL40]], [[VAL41]] {permutation_map: [[D0D1TOD1]]} : vector<8xf32>
// CHECK-NEXT: [[VAL50:%.*]] = affine.apply [[D0P2]](%i0)
// CHECK-NEXT: [[VAL51:%.*]] = affine.apply [[D0P8]](%i1)
// CHECK-NEXT: vector_transfer_write {{.*}}, {{.*}}, [[VAL50]], [[VAL51]] {permutation_map: [[D0D1TOD1]]} : vector<8xf32>
// CHECK-NEXT: %[[VAL00:.*]] = affine.apply [[ID1]](%i0)
// CHECK-NEXT: %[[VAL01:.*]] = affine.apply [[ID1]](%i1)
// CHECK-NEXT: vector.transfer_write {{.*}}, {{.*}}[%[[VAL00]], %[[VAL01]]] {permutation_map: [[D0D1TOD1]]} : vector<8xf32>, memref<?x?xf32>
// CHECK-NEXT: %[[VAL10:.*]] = affine.apply [[ID1]](%i0)
// CHECK-NEXT: %[[VAL11:.*]] = affine.apply [[D0P8]](%i1)
// CHECK-NEXT: vector.transfer_write {{.*}}, {{.*}}[%[[VAL10]], %[[VAL11]]] {permutation_map: [[D0D1TOD1]]} : vector<8xf32>, memref<?x?xf32>
// CHECK-NEXT: %[[VAL20:.*]] = affine.apply [[D0P1]](%i0)
// CHECK-NEXT: %[[VAL21:.*]] = affine.apply [[ID1]](%i1)
// CHECK-NEXT: vector.transfer_write {{.*}}, {{.*}}[%[[VAL20]], %[[VAL21]]] {permutation_map: [[D0D1TOD1]]} : vector<8xf32>, memref<?x?xf32>
// CHECK-NEXT: %[[VAL30:.*]] = affine.apply [[D0P1]](%i0)
// CHECK-NEXT: %[[VAL31:.*]] = affine.apply [[D0P8]](%i1)
// CHECK-NEXT: vector.transfer_write {{.*}}, {{.*}}[%[[VAL30]], %[[VAL31]]] {permutation_map: [[D0D1TOD1]]} : vector<8xf32>, memref<?x?xf32>
// CHECK-NEXT: %[[VAL40:.*]] = affine.apply [[D0P2]](%i0)
// CHECK-NEXT: %[[VAL41:.*]] = affine.apply [[ID1]](%i1)
// CHECK-NEXT: vector.transfer_write {{.*}}, {{.*}}[%[[VAL40]], %[[VAL41]]] {permutation_map: [[D0D1TOD1]]} : vector<8xf32>, memref<?x?xf32>
// CHECK-NEXT: %[[VAL50:.*]] = affine.apply [[D0P2]](%i0)
// CHECK-NEXT: %[[VAL51:.*]] = affine.apply [[D0P8]](%i1)
// CHECK-NEXT: vector.transfer_write {{.*}}, {{.*}}[%[[VAL50]], %[[VAL51]]] {permutation_map: [[D0D1TOD1]]} : vector<8xf32>, memref<?x?xf32>
affine.for %i0 = 0 to %M {
affine.for %i1 = 0 to %N {
// non-scoped %f1
@ -63,40 +63,40 @@ func @vector_add_2d(%M : index, %N : index) -> f32 {
// CHECK-NEXT: affine.for %i5 = 0 to %arg1 step 16 {
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = vector_transfer_read
// CHECK-NEXT: {{.*}} = vector.transfer_read
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = vector_transfer_read
// CHECK-NEXT: {{.*}} = vector.transfer_read
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = vector_transfer_read
// CHECK-NEXT: {{.*}} = vector.transfer_read
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = vector_transfer_read
// CHECK-NEXT: {{.*}} = vector.transfer_read
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = vector_transfer_read
// CHECK-NEXT: {{.*}} = vector.transfer_read
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = vector_transfer_read
// CHECK-NEXT: {{.*}} = vector.transfer_read
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = vector_transfer_read
// CHECK-NEXT: {{.*}} = vector.transfer_read
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = vector_transfer_read
// CHECK-NEXT: {{.*}} = vector.transfer_read
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = vector_transfer_read
// CHECK-NEXT: {{.*}} = vector.transfer_read
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = vector_transfer_read
// CHECK-NEXT: {{.*}} = vector.transfer_read
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = vector_transfer_read
// CHECK-NEXT: {{.*}} = vector.transfer_read
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = vector_transfer_read
// CHECK-NEXT: {{.*}} = vector.transfer_read
// CHECK-NEXT: {{.*}} = addf {{.*}} : vector<8xf32>
// CHECK-NEXT: {{.*}} = addf {{.*}} : vector<8xf32>
// CHECK-NEXT: {{.*}} = addf {{.*}} : vector<8xf32>
@ -105,22 +105,22 @@ func @vector_add_2d(%M : index, %N : index) -> f32 {
// CHECK-NEXT: {{.*}} = addf {{.*}} : vector<8xf32>
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: vector_transfer_write
// CHECK-NEXT: vector.transfer_write
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: vector_transfer_write
// CHECK-NEXT: vector.transfer_write
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: vector_transfer_write
// CHECK-NEXT: vector.transfer_write
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: vector_transfer_write
// CHECK-NEXT: vector.transfer_write
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: vector_transfer_write
// CHECK-NEXT: vector.transfer_write
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: vector_transfer_write
// CHECK-NEXT: vector.transfer_write
//
affine.for %i4 = 0 to %M {
affine.for %i5 = 0 to %N {

View File

@ -17,12 +17,12 @@ func @vector_add_2d(%M : index, %N : index) -> f32 {
// CHECK-NEXT: affine.for %i1 = 0 to %arg1 step 32 {
// CHECK-NEXT: {{.*}} = constant splat<vector<3x16xf32>, 1.000000e+00> : vector<3x16xf32>
// CHECK-NEXT: {{.*}} = constant splat<vector<3x16xf32>, 1.000000e+00> : vector<3x16xf32>
// CHECK-NEXT: [[VAL00:%.*]] = affine.apply [[ID1]](%i0)
// CHECK-NEXT: [[VAL01:%.*]] = affine.apply [[ID1]](%i1)
// CHECK-NEXT: vector_transfer_write {{.*}}, {{.*}}, [[VAL00]], [[VAL01]] {permutation_map: [[ID2]]} : vector<3x16xf32>
// CHECK-NEXT: [[VAL10:%.*]] = affine.apply [[ID1]](%i0)
// CHECK-NEXT: [[VAL11:%.*]] = affine.apply [[D0P16]](%i1)
// CHECK-NEXT: vector_transfer_write {{.*}}, {{.*}}, [[VAL10]], [[VAL11]] {permutation_map: [[ID2]]} : vector<3x16xf32>
// CHECK-NEXT: %[[VAL00:.*]] = affine.apply [[ID1]](%i0)
// CHECK-NEXT: %[[VAL01:.*]] = affine.apply [[ID1]](%i1)
// CHECK-NEXT: vector.transfer_write {{.*}}, {{.*}}[%[[VAL00]], %[[VAL01]]] {permutation_map: [[ID2]]} : vector<3x16xf32>, memref<?x?xf32>
// CHECK-NEXT: %[[VAL10:.*]] = affine.apply [[ID1]](%i0)
// CHECK-NEXT: %[[VAL11:.*]] = affine.apply [[D0P16]](%i1)
// CHECK-NEXT: vector.transfer_write {{.*}}, {{.*}}[%[[VAL10]], %[[VAL11]]] {permutation_map: [[ID2]]} : vector<3x16xf32>, memref<?x?xf32>
//
affine.for %i0 = 0 to %M {
affine.for %i1 = 0 to %N {
@ -35,12 +35,12 @@ func @vector_add_2d(%M : index, %N : index) -> f32 {
// CHECK-NEXT: affine.for %i3 = 0 to %arg1 step 32 {
// CHECK-NEXT: {{.*}} = constant splat<vector<3x16xf32>, 2.000000e+00> : vector<3x16xf32>
// CHECK-NEXT: {{.*}} = constant splat<vector<3x16xf32>, 2.000000e+00> : vector<3x16xf32>
// CHECK-NEXT: [[VAL00:%.*]] = affine.apply [[ID1]](%i2)
// CHECK-NEXT: [[VAL01:%.*]] = affine.apply [[ID1]](%i3)
// CHECK-NEXT: vector_transfer_write {{.*}}, {{.*}}, [[VAL00]], [[VAL01]] {permutation_map: [[ID2]]} : vector<3x16xf32>
// CHECK-NEXT: [[VAL10:%.*]] = affine.apply [[ID1]](%i2)
// CHECK-NEXT: [[VAL11:%.*]] = affine.apply [[D0P16]](%i3)
// CHECK-NEXT: vector_transfer_write {{.*}}, {{.*}}, [[VAL10]], [[VAL11]] {permutation_map: [[ID2]]} : vector<3x16xf32>
// CHECK-NEXT: %[[VAL00:.*]] = affine.apply [[ID1]](%i2)
// CHECK-NEXT: %[[VAL01:.*]] = affine.apply [[ID1]](%i3)
// CHECK-NEXT: vector.transfer_write {{.*}}, {{.*}}[%[[VAL00]], %[[VAL01]]] {permutation_map: [[ID2]]} : vector<3x16xf32>, memref<?x?xf32>
// CHECK-NEXT: %[[VAL10:.*]] = affine.apply [[ID1]](%i2)
// CHECK-NEXT: %[[VAL11:.*]] = affine.apply [[D0P16]](%i3)
// CHECK-NEXT: vector.transfer_write {{.*}}, {{.*}}[%[[VAL10]], %[[VAL11]]] {permutation_map: [[ID2]]} : vector<3x16xf32>, memref<?x?xf32>
//
affine.for %i2 = 0 to %M {
affine.for %i3 = 0 to %N {
@ -53,24 +53,24 @@ func @vector_add_2d(%M : index, %N : index) -> f32 {
// CHECK-NEXT: affine.for %i5 = 0 to %arg1 step 32 {
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = vector_transfer_read
// CHECK-NEXT: {{.*}} = vector.transfer_read
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = vector_transfer_read
// CHECK-NEXT: {{.*}} = vector.transfer_read
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = vector_transfer_read
// CHECK-NEXT: {{.*}} = vector.transfer_read
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = vector_transfer_read
// CHECK-NEXT: {{.*}} = vector.transfer_read
// CHECK-NEXT: {{.*}} = addf {{.*}} : vector<3x16xf32>
// CHECK-NEXT: {{.*}} = addf {{.*}} : vector<3x16xf32>
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: vector_transfer_write
// CHECK-NEXT: vector.transfer_write
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: {{.*}} = affine.apply
// CHECK-NEXT: vector_transfer_write
// CHECK-NEXT: vector.transfer_write
//
affine.for %i4 = 0 to %M {
affine.for %i5 = 0 to %N {

View File

@ -13,7 +13,7 @@
// Maps introduced to vectorize fastest varying memory index.
// CHECK-LABEL: func @vec1d_1
func @vec1d_1(%A : memref<?x?xf32>, %B : memref<?x?x?xf32>) {
// CHECK-DAG: [[C0:%[a-z0-9_]+]] = constant 0 : index
// CHECK-DAG: %[[C0:[a-z0-9_]+]] = constant 0 : index
// CHECK-DAG: [[ARG_M:%[0-9]+]] = dim %arg0, 0 : memref<?x?xf32>
// CHECK-DAG: [[ARG_N:%[0-9]+]] = dim %arg0, 1 : memref<?x?xf32>
// CHECK-DAG: [[ARG_P:%[0-9]+]] = dim %arg1, 2 : memref<?x?x?xf32>
@ -23,7 +23,7 @@ func @vec1d_1(%A : memref<?x?xf32>, %B : memref<?x?x?xf32>) {
%cst0 = constant 0 : index
//
// CHECK: for {{.*}} step 128
// CHECK-NEXT: {{.*}} = vector_transfer_read %arg0, [[C0]], [[C0]] {permutation_map: #[[map_proj_d0d1_0]]} : (memref<?x?xf32>, index, index) -> vector<128xf32>
// CHECK-NEXT: {{.*}} = vector.transfer_read %arg0[%[[C0]], %[[C0]]] {permutation_map: #[[map_proj_d0d1_0]]} : memref<?x?xf32>, vector<128xf32>
affine.for %i0 = 0 to %M { // vectorized due to scalar -> vector
%a0 = load %A[%cst0, %cst0] : memref<?x?xf32>
}
@ -32,7 +32,7 @@ func @vec1d_1(%A : memref<?x?xf32>, %B : memref<?x?x?xf32>) {
// CHECK-LABEL: func @vec1d_2
func @vec1d_2(%A : memref<?x?xf32>, %B : memref<?x?x?xf32>) {
// CHECK-DAG: [[C0:%[a-z0-9_]+]] = constant 0 : index
// CHECK-DAG: %[[C0:[a-z0-9_]+]] = constant 0 : index
// CHECK-DAG: [[ARG_M:%[0-9]+]] = dim %arg0, 0 : memref<?x?xf32>
// CHECK-DAG: [[ARG_N:%[0-9]+]] = dim %arg0, 1 : memref<?x?xf32>
// CHECK-DAG: [[ARG_P:%[0-9]+]] = dim %arg1, 2 : memref<?x?x?xf32>
@ -42,8 +42,8 @@ func @vec1d_2(%A : memref<?x?xf32>, %B : memref<?x?x?xf32>) {
%cst0 = constant 0 : index
//
// CHECK:for [[IV3:%[a-zA-Z0-9]+]] = 0 to [[ARG_M]] step 128
// CHECK-NEXT: [[APP3:%[a-zA-Z0-9]+]] = affine.apply {{.*}}[[IV3]]
// CHECK-NEXT: {{.*}} = vector_transfer_read %arg0, [[C0]], [[APP3]] {permutation_map: #[[map_proj_d0d1_d1]]} : {{.*}} -> vector<128xf32>
// CHECK-NEXT: %[[APP3:[a-zA-Z0-9]+]] = affine.apply {{.*}}[[IV3]]
// CHECK-NEXT: {{.*}} = vector.transfer_read %arg0[%[[C0]], %[[APP3]]] {permutation_map: #[[map_proj_d0d1_d1]]} : memref<?x?xf32>, vector<128xf32>
affine.for %i3 = 0 to %M { // vectorized
%r3 = affine.apply (d0) -> (d0) (%i3)
%a3 = load %A[%cst0, %r3] : memref<?x?xf32>
@ -53,7 +53,7 @@ func @vec1d_2(%A : memref<?x?xf32>, %B : memref<?x?x?xf32>) {
// CHECK-LABEL: func @vec1d_3
func @vec1d_3(%A : memref<?x?xf32>, %B : memref<?x?x?xf32>) {
// CHECK-DAG: [[C0:%[a-z0-9_]+]] = constant 0 : index
// CHECK-DAG: %[[C0:[a-z0-9_]+]] = constant 0 : index
// CHECK-DAG: [[ARG_M:%[0-9]+]] = dim %arg0, 0 : memref<?x?xf32>
// CHECK-DAG: [[ARG_N:%[0-9]+]] = dim %arg0, 1 : memref<?x?xf32>
// CHECK-DAG: [[ARG_P:%[0-9]+]] = dim %arg1, 2 : memref<?x?x?xf32>
@ -64,9 +64,9 @@ func @vec1d_3(%A : memref<?x?xf32>, %B : memref<?x?x?xf32>) {
//
// CHECK:for [[IV8:%[i0-9]+]] = 0 to [[ARG_M]] step 128
// CHECK-NEXT: for [[IV9:%[i0-9]*]] = 0 to [[ARG_N]] {
// CHECK-NEXT: [[APP9_0:%[0-9]+]] = affine.apply {{.*}}([[IV8]], [[IV9]])
// CHECK-NEXT: [[APP9_1:%[0-9]+]] = affine.apply {{.*}}([[IV8]], [[IV9]])
// CHECK-NEXT: {{.*}} = vector_transfer_read %arg0, [[APP9_0]], [[APP9_1]] {permutation_map: #[[map_proj_d0d1_d1]]} : {{.*}} -> vector<128xf32>
// CHECK-NEXT: %[[APP9_0:[0-9]+]] = affine.apply {{.*}}([[IV8]], [[IV9]])
// CHECK-NEXT: %[[APP9_1:[0-9]+]] = affine.apply {{.*}}([[IV8]], [[IV9]])
// CHECK-NEXT: {{.*}} = vector.transfer_read %arg0[%[[APP9_0]], %[[APP9_1]]] {permutation_map: #[[map_proj_d0d1_d1]]} : memref<?x?xf32>, vector<128xf32>
affine.for %i8 = 0 to %M { // vectorized
affine.for %i9 = 0 to %N {
%r90 = affine.apply (d0, d1) -> (d1) (%i8, %i9)
@ -87,7 +87,7 @@ func @vector_add_2d(%M : index, %N : index) -> f32 {
affine.for %i0 = 0 to %M {
affine.for %i1 = 0 to %N {
// CHECK: [[C1:%.*]] = constant splat<vector<128xf32>, 1.000000e+00> : vector<128xf32>
// CHECK: vector_transfer_write [[C1]], {{.*}} {permutation_map: #[[map_proj_d0d1_d1]]} : vector<128xf32>, memref<?x?xf32>, index, index
// CHECK: vector.transfer_write [[C1]], {{.*}} {permutation_map: #[[map_proj_d0d1_d1]]} : vector<128xf32>, memref<?x?xf32>
// non-scoped %f1
store %f1, %A[%i0, %i1] : memref<?x?xf32, 0>
}
@ -95,22 +95,22 @@ func @vector_add_2d(%M : index, %N : index) -> f32 {
affine.for %i2 = 0 to %M {
affine.for %i3 = 0 to %N {
// CHECK: [[C3:%.*]] = constant splat<vector<128xf32>, 2.000000e+00> : vector<128xf32>
// CHECK: vector_transfer_write [[C3]], {{.*}} {permutation_map: #[[map_proj_d0d1_d1]]} : vector<128xf32>, memref<?x?xf32>, index, index
// CHECK: vector.transfer_write [[C3]], {{.*}} {permutation_map: #[[map_proj_d0d1_d1]]} : vector<128xf32>, memref<?x?xf32>
// non-scoped %f2
store %f2, %B[%i2, %i3] : memref<?x?xf32, 0>
}
}
affine.for %i4 = 0 to %M {
affine.for %i5 = 0 to %N {
// CHECK: [[A5:%.*]] = vector_transfer_read %0, {{.*}} {permutation_map: #[[map_proj_d0d1_d1]]} : (memref<?x?xf32>, index, index) -> vector<128xf32>
// CHECK: [[B5:%.*]] = vector_transfer_read %1, {{.*}} {permutation_map: #[[map_proj_d0d1_d1]]} : (memref<?x?xf32>, index, index) -> vector<128xf32>
// CHECK: [[A5:%.*]] = vector.transfer_read %0[{{.*}}] {permutation_map: #[[map_proj_d0d1_d1]]} : memref<?x?xf32>, vector<128xf32>
// CHECK: [[B5:%.*]] = vector.transfer_read %1[{{.*}}] {permutation_map: #[[map_proj_d0d1_d1]]} : memref<?x?xf32>, vector<128xf32>
// CHECK: [[S5:%.*]] = addf [[A5]], [[B5]] : vector<128xf32>
// CHECK: [[SPLAT1:%.*]] = constant splat<vector<128xf32>, 1.000000e+00> : vector<128xf32>
// CHECK: [[S6:%.*]] = addf [[S5]], [[SPLAT1]] : vector<128xf32>
// CHECK: [[SPLAT2:%.*]] = constant splat<vector<128xf32>, 2.000000e+00> : vector<128xf32>
// CHECK: [[S7:%.*]] = addf [[S5]], [[SPLAT2]] : vector<128xf32>
// CHECK: [[S8:%.*]] = addf [[S7]], [[S6]] : vector<128xf32>
// CHECK: vector_transfer_write [[S8]], {{.*}} {permutation_map: #[[map_proj_d0d1_d1]]} : vector<128xf32>, memref<?x?xf32>, index, index
// CHECK: vector.transfer_write [[S8]], {{.*}} {permutation_map: #[[map_proj_d0d1_d1]]} : vector<128xf32>, memref<?x?xf32>
%a5 = load %A[%i4, %i5] : memref<?x?xf32, 0>
%b5 = load %B[%i4, %i5] : memref<?x?xf32, 0>
%s5 = addf %a5, %b5 : f32
@ -179,9 +179,9 @@ func @vec_rejected_3(%A : memref<?x?xf32>, %B : memref<?x?x?xf32>) {
//
// CHECK:for [[IV4:%[i0-9]+]] = 0 to [[ARG_M]] step 128 {
// CHECK-NEXT: for [[IV5:%[i0-9]*]] = 0 to [[ARG_N]] {
// CHECK-NEXT: [[APP50:%[0-9]+]] = affine.apply {{.*}}([[IV4]], [[IV5]])
// CHECK-NEXT: [[APP51:%[0-9]+]] = affine.apply {{.*}}([[IV4]], [[IV5]])
// CHECK-NEXT: {{.*}} = vector_transfer_read %arg0, [[APP50]], [[APP51]] {permutation_map: #[[map_proj_d0d1_d1]]} : {{.*}} -> vector<128xf32>
// CHECK-NEXT: %[[APP50:[0-9]+]] = affine.apply {{.*}}([[IV4]], [[IV5]])
// CHECK-NEXT: %[[APP51:[0-9]+]] = affine.apply {{.*}}([[IV4]], [[IV5]])
// CHECK-NEXT: {{.*}} = vector.transfer_read %arg0[%[[APP50]], %[[APP51]]] {permutation_map: #[[map_proj_d0d1_d1]]} : memref<?x?xf32>, vector<128xf32>
affine.for %i4 = 0 to %M { // vectorized
affine.for %i5 = 0 to %N { // not vectorized, would vectorize with --test-fastest-varying=1
%r50 = affine.apply (d0, d1) -> (d1) (%i4, %i5)
@ -289,7 +289,7 @@ func @vec_rejected_7(%A : memref<?x?xf32>, %B : memref<?x?x?xf32>) {
// CHECK-LABEL: func @vec_rejected_8
func @vec_rejected_8(%A : memref<?x?xf32>, %B : memref<?x?x?xf32>) {
// CHECK-DAG: [[C0:%[a-z0-9_]+]] = constant 0 : index
// CHECK-DAG: %[[C0:[a-z0-9_]+]] = constant 0 : index
// CHECK-DAG: [[ARG_M:%[0-9]+]] = dim %arg0, 0 : memref<?x?xf32>
// CHECK-DAG: [[ARG_N:%[0-9]+]] = dim %arg0, 1 : memref<?x?xf32>
// CHECK-DAG: [[ARG_P:%[0-9]+]] = dim %arg1, 2 : memref<?x?x?xf32>
@ -300,7 +300,7 @@ func @vec_rejected_8(%A : memref<?x?xf32>, %B : memref<?x?x?xf32>) {
//
// CHECK: affine.for %i{{[0-9]*}} = 0 to %{{[0-9]*}} {
// CHECK: for [[IV18:%[a-zA-Z0-9]+]] = 0 to [[ARG_M]] step 128
// CHECK: {{.*}} = vector_transfer_read %arg0, [[C0]], [[C0]] {permutation_map: #[[map_proj_d0d1_0]]} : {{.*}} -> vector<128xf32>
// CHECK: {{.*}} = vector.transfer_read %arg0[%[[C0]], %[[C0]]] {permutation_map: #[[map_proj_d0d1_0]]} : memref<?x?xf32>, vector<128xf32>
affine.for %i17 = 0 to %M { // not vectorized, the 1-D pattern that matched %i18 in DFS post-order prevents vectorizing %i17
affine.for %i18 = 0 to %M { // vectorized due to scalar -> vector
%a18 = load %A[%cst0, %cst0] : memref<?x?xf32>
@ -311,7 +311,7 @@ func @vec_rejected_8(%A : memref<?x?xf32>, %B : memref<?x?x?xf32>) {
// CHECK-LABEL: func @vec_rejected_9
func @vec_rejected_9(%A : memref<?x?xf32>, %B : memref<?x?x?xf32>) {
// CHECK-DAG: [[C0:%[a-z0-9_]+]] = constant 0 : index
// CHECK-DAG: %[[C0:[a-z0-9_]+]] = constant 0 : index
// CHECK-DAG: [[ARG_M:%[0-9]+]] = dim %arg0, 0 : memref<?x?xf32>
// CHECK-DAG: [[ARG_N:%[0-9]+]] = dim %arg0, 1 : memref<?x?xf32>
// CHECK-DAG: [[ARG_P:%[0-9]+]] = dim %arg1, 2 : memref<?x?x?xf32>
@ -322,7 +322,7 @@ func @vec_rejected_9(%A : memref<?x?xf32>, %B : memref<?x?x?xf32>) {
//
// CHECK: affine.for %i{{[0-9]*}} = 0 to %{{[0-9]*}} {
// CHECK: for [[IV18:%[a-zA-Z0-9]+]] = 0 to [[ARG_M]] step 128
// CHECK: {{.*}} = vector_transfer_read %arg0, [[C0]], [[C0]] {permutation_map: #[[map_proj_d0d1_0]]} : {{.*}} -> vector<128xf32>
// CHECK: {{.*}} = vector.transfer_read %arg0[%[[C0]], %[[C0]]] {permutation_map: #[[map_proj_d0d1_0]]} : memref<?x?xf32>, vector<128xf32>
affine.for %i17 = 0 to %M { // not vectorized, the 1-D pattern that matched %i18 in DFS post-order prevents vectorizing %i17
affine.for %i18 = 0 to %M { // vectorized due to scalar -> vector
%a18 = load %A[%cst0, %cst0] : memref<?x?xf32>

View File

@ -22,7 +22,7 @@ func @vec2d(%A : memref<?x?x?xf32>) {
// affine.for %i0 = 0 to %0 {
// affine.for %i1 = 0 to %1 step 32 {
// affine.for %i2 = 0 to %2 step 256 {
// %3 = "vector_transfer_read"(%arg0, %i0, %i1, %i2) : (memref<?x?x?xf32>, index, index, index) -> vector<32x256xf32>
// %3 = "vector.transfer_read"(%arg0, %i0, %i1, %i2) : (memref<?x?x?xf32>, index, index, index) -> vector<32x256xf32>
affine.for %i0 = 0 to %M {
affine.for %i1 = 0 to %N {
affine.for %i2 = 0 to %P {
@ -54,7 +54,7 @@ func @vector_add_2d(%M : index, %N : index) -> f32 {
affine.for %i0 = 0 to %M {
affine.for %i1 = 0 to %N {
// CHECK: [[C1:%.*]] = constant splat<vector<32x256xf32>, 1.000000e+00> : vector<32x256xf32>
// CHECK: vector_transfer_write [[C1]], {{.*}} {permutation_map: #[[map_id2]]} : vector<32x256xf32>, memref<?x?xf32>, index, index
// CHECK: vector.transfer_write [[C1]], {{.*}} {permutation_map: #[[map_id2]]} : vector<32x256xf32>, memref<?x?xf32>
// non-scoped %f1
store %f1, %A[%i0, %i1] : memref<?x?xf32, 0>
}
@ -62,22 +62,22 @@ func @vector_add_2d(%M : index, %N : index) -> f32 {
affine.for %i2 = 0 to %M {
affine.for %i3 = 0 to %N {
// CHECK: [[C3:%.*]] = constant splat<vector<32x256xf32>, 2.000000e+00> : vector<32x256xf32>
// CHECK: vector_transfer_write [[C3]], {{.*}} {permutation_map: #[[map_id2]]} : vector<32x256xf32>, memref<?x?xf32>, index, index
// CHECK: vector.transfer_write [[C3]], {{.*}} {permutation_map: #[[map_id2]]} : vector<32x256xf32>, memref<?x?xf32>
// non-scoped %f2
store %f2, %B[%i2, %i3] : memref<?x?xf32, 0>
}
}
affine.for %i4 = 0 to %M {
affine.for %i5 = 0 to %N {
// CHECK: [[A5:%.*]] = vector_transfer_read %0, {{.*}} {permutation_map: #[[map_id2]]} : (memref<?x?xf32>, index, index) -> vector<32x256xf32>
// CHECK: [[B5:%.*]] = vector_transfer_read %1, {{.*}} {permutation_map: #[[map_id2]]} : (memref<?x?xf32>, index, index) -> vector<32x256xf32>
// CHECK: [[A5:%.*]] = vector.transfer_read %0[{{.*}}] {permutation_map: #[[map_id2]]} : memref<?x?xf32>, vector<32x256xf32>
// CHECK: [[B5:%.*]] = vector.transfer_read %1[{{.*}}] {permutation_map: #[[map_id2]]} : memref<?x?xf32>, vector<32x256xf32>
// CHECK: [[S5:%.*]] = addf [[A5]], [[B5]] : vector<32x256xf32>
// CHECK: [[SPLAT1:%.*]] = constant splat<vector<32x256xf32>, 1.000000e+00> : vector<32x256xf32>
// CHECK: [[S6:%.*]] = addf [[S5]], [[SPLAT1]] : vector<32x256xf32>
// CHECK: [[SPLAT2:%.*]] = constant splat<vector<32x256xf32>, 2.000000e+00> : vector<32x256xf32>
// CHECK: [[S7:%.*]] = addf [[S5]], [[SPLAT2]] : vector<32x256xf32>
// CHECK: [[S8:%.*]] = addf [[S7]], [[S6]] : vector<32x256xf32>
// CHECK: vector_transfer_write [[S8]], {{.*}} {permutation_map: #[[map_id2]]} : vector<32x256xf32>, memref<?x?xf32>, index, index
// CHECK: vector.transfer_write [[S8]], {{.*}} {permutation_map: #[[map_id2]]} : vector<32x256xf32>, memref<?x?xf32>
//
%a5 = load %A[%i4, %i5] : memref<?x?xf32, 0>
%b5 = load %B[%i4, %i5] : memref<?x?xf32, 0>
@ -110,7 +110,7 @@ func @vectorize_matmul(%arg0: memref<?x?xf32>, %arg1: memref<?x?xf32>, %arg2: me
// VECT: {{.*}} #[[map_id1]](%[[M]]) step 4 {
// VECT-NEXT: {{.*}} #[[map_id1]](%[[N]]) step 8 {
// VECT: %[[VC0:.*]] = constant splat<vector<4x8xf32>, 0.000000e+00> : vector<4x8xf32>
// VECT-NEXT: vector_transfer_write %[[VC0]], %arg2, %{{.*}}, %{{.*}} {permutation_map: #[[map_id2]]}
// VECT-NEXT: vector.transfer_write %[[VC0]], %arg2[%{{.*}}, %{{.*}}] {permutation_map: #[[map_id2]]} : vector<4x8xf32>, memref<?x?xf32>
affine.for %i0 = (d0) -> (d0)(%c0) to (d0) -> (d0)(%M) {
affine.for %i1 = (d0) -> (d0)(%c0) to (d0) -> (d0)(%N) {
%cst = constant 0.000000e+00 : f32
@ -120,12 +120,12 @@ func @vectorize_matmul(%arg0: memref<?x?xf32>, %arg1: memref<?x?xf32>, %arg2: me
// VECT: affine.for %[[I2:.*]] = #[[map_id1]](%[[C0]]) to #[[map_id1]](%[[M]]) step 4 {
// VECT-NEXT: affine.for %[[I3:.*]] = #[[map_id1]](%[[C0]]) to #[[map_id1]](%[[N]]) step 8 {
// VECT-NEXT: affine.for %[[I4:.*]] = #map5(%[[C0]]) to #[[map_id1]](%[[K]]) {
// VECT-NEXT: %[[A:.*]] = vector_transfer_read %arg1, %[[I4]], %[[I3]] {permutation_map: #[[map_proj_d0d1_zerod1]]}
// VECT-NEXT: %[[B:.*]] = vector_transfer_read %arg0, %[[I2]], %[[I4]] {permutation_map: #[[map_proj_d0d1_d0zero]]}
// VECT-NEXT: %[[A:.*]] = vector_transfer_read %arg1[%[[I4]], %[[I3]]] {permutation_map: #[[map_proj_d0d1_zerod1]]} : memref<?x?xf32>, vector<4x8xf32>
// VECT-NEXT: %[[B:.*]] = vector_transfer_read %arg0[%[[I2]], %[[I4]]] {permutation_map: #[[map_proj_d0d1_d0zero]]} : memref<?x?xf32>, vector<4x8xf32>
// VECT-NEXT: %[[C:.*]] = mulf %[[B]], %[[A]] : vector<4x8xf32>
// VECT-NEXT: %[[D:.*]] = vector_transfer_read %arg2, %[[I2]], %[[I3]] {permutation_map: #[[map_id2]]}
// VECT-NEXT: %[[D:.*]] = vector_transfer_read %arg2[%[[I2]], %[[I3]]] {permutation_map: #[[map_id2]]} : memref<?x?xf32>, vector<4x8xf32>
// VECT-NEXT: %[[E:.*]] = addf %[[D]], %[[C]] : vector<4x8xf32>
// VECT-NEXT: vector_transfer_write %[[E]], %arg2, %[[I2]], %[[I3]] {permutation_map: #[[map_id2]]} : vector<4x8xf32>, memref<?x?xf32>, index, index
// VECT-NEXT: vector_transfer_write %[[E]], %arg2[%[[I2]], %[[I3]]] {permutation_map: #[[map_id2]]} : vector<4x8xf32>, memref<?x?xf32>
affine.for %i2 = (d0) -> (d0)(%c0) to (d0) -> (d0)(%M) {
affine.for %i3 = (d0) -> (d0)(%c0) to (d0) -> (d0)(%N) {
affine.for %i4 = (d0) -> (d0)(%c0) to (d0) -> (d0)(%K) {

View File

@ -12,7 +12,7 @@ func @vec3d(%A : memref<?x?x?xf32>) {
// CHECK: affine.for %i2 = 0 to %0 step 32 {
// CHECK: affine.for %i3 = 0 to %1 step 64 {
// CHECK: affine.for %i4 = 0 to %2 step 256 {
// CHECK: %3 = vector_transfer_read %arg0, %i2, %i3, %i4 {permutation_map: #[[map_proj_d0d1d2_d0d1d2]]} : (memref<?x?x?xf32>, index, index, index) -> vector<32x64x256xf32>
// CHECK: %3 = vector_transfer_read %arg0[%i2, %i3, %i4] {permutation_map: #[[map_proj_d0d1d2_d0d1d2]]} : memref<?x?x?xf32>, vector<32x64x256xf32>
affine.for %t0 = 0 to %0 {
affine.for %t1 = 0 to %0 {
affine.for %i0 = 0 to %0 {

View File

@ -10,7 +10,7 @@ func @vec2d(%A : memref<?x?x?xf32>) {
// CHECK: affine.for %i0 = 0 to %0 step 32
// CHECK: affine.for %i1 = 0 to %1 {
// CHECK: affine.for %i2 = 0 to %2 step 256
// CHECK: {{.*}} = vector_transfer_read %arg0, %i0, %i1, %i2 {permutation_map: #[[map_proj_d0d1d2_d0d2]]} : (memref<?x?x?xf32>, index, index, index) -> vector<32x256xf32>
// CHECK: {{.*}} = vector_transfer_read %arg0[%i0, %i1, %i2] {permutation_map: #[[map_proj_d0d1d2_d0d2]]} : memref<?x?x?xf32>, vector<32x256xf32>
affine.for %i0 = 0 to %M {
affine.for %i1 = 0 to %N {
affine.for %i2 = 0 to %P {

View File

@ -22,7 +22,7 @@ func @vec2d(%A : memref<?x?x?xf32>) {
// CHECK: affine.for %i3 = 0 to %0 step 32
// CHECK: affine.for %i4 = 0 to %1 step 256
// CHECK: affine.for %i5 = 0 to %2 {
// CHECK: {{.*}} = vector_transfer_read %arg0, %i4, %i5, %i3 {permutation_map: #[[map_proj_d0d1d2_d2d0]]} : (memref<?x?x?xf32>, index, index, index) -> vector<32x256xf32>
// CHECK: {{.*}} = vector_transfer_read %arg0[%i4, %i5, %i3] {permutation_map: #[[map_proj_d0d1d2_d2d0]]} : memref<?x?x?xf32>, vector<32x256xf32>
affine.for %i3 = 0 to %M {
affine.for %i4 = 0 to %N {
affine.for %i5 = 0 to %P {
@ -40,12 +40,12 @@ func @vec2d_imperfectly_nested(%A : memref<?x?x?xf32>) {
// CHECK: affine.for %i0 = 0 to %0 step 32 {
// CHECK: affine.for %i1 = 0 to %1 {
// CHECK: affine.for %i2 = 0 to %2 step 256 {
// CHECK: %3 = vector_transfer_read %arg0, %i2, %i1, %i0 {permutation_map: #[[map_proj_d0d1d2_d2d0]]} : (memref<?x?x?xf32>, index, index, index) -> vector<32x256xf32>
// CHECK: %3 = vector_transfer_read %arg0[%i2, %i1, %i0] {permutation_map: #[[map_proj_d0d1d2_d2d0]]} : memref<?x?x?xf32>, vector<32x256xf32>
// CHECK: affine.for %i3 = 0 to %1 step 256 {
// CHECK: affine.for %i4 = 0 to %2 {
// CHECK: %4 = vector_transfer_read %arg0, %i3, %i4, %i0 {permutation_map: #[[map_proj_d0d1d2_d2d0]]} : (memref<?x?x?xf32>, index, index, index) -> vector<32x256xf32>
// CHECK: %4 = vector_transfer_read %arg0[%i3, %i4, %i0] {permutation_map: #[[map_proj_d0d1d2_d2d0]]} : memref<?x?x?xf32>, vector<32x256xf32>
// CHECK: affine.for %i5 = 0 to %2 {
// CHECK: %5 = vector_transfer_read %arg0, %i3, %i5, %i0 {permutation_map: #[[map_proj_d0d1d2_d2d0]]} : (memref<?x?x?xf32>, index, index, index) -> vector<32x256xf32>
// CHECK: %5 = vector_transfer_read %arg0[%i3, %i5, %i0] {permutation_map: #[[map_proj_d0d1d2_d2d0]]} : memref<?x?x?xf32>, vector<32x256xf32>
affine.for %i0 = 0 to %0 {
affine.for %i1 = 0 to %1 {
affine.for %i2 = 0 to %2 {

View File

@ -22,7 +22,7 @@ func @vec2d(%A : memref<?x?x?xf32>) {
// CHECK: affine.for %i3 = 0 to %0 step 32
// CHECK: affine.for %i4 = 0 to %1 {
// CHECK: affine.for %i5 = 0 to %2 step 256
// CHECK: {{.*}} = vector_transfer_read %arg0, %i4, %i5, %i3 {permutation_map: #[[map_proj_d0d1d2_d2d1]]} : (memref<?x?x?xf32>, index, index, index) -> vector<32x256xf32>
// CHECK: {{.*}} = vector_transfer_read %arg0[%i4, %i5, %i3] {permutation_map: #[[map_proj_d0d1d2_d2d1]]} : memref<?x?x?xf32>, vector<32x256xf32>
affine.for %i3 = 0 to %M {
affine.for %i4 = 0 to %N {
affine.for %i5 = 0 to %P {
@ -40,12 +40,12 @@ func @vec2d_imperfectly_nested(%A : memref<?x?x?xf32>) {
// CHECK: affine.for %i0 = 0 to %0 step 32 {
// CHECK: affine.for %i1 = 0 to %1 step 256 {
// CHECK: affine.for %i2 = 0 to %2 {
// CHECK: %3 = vector_transfer_read %arg0, %i2, %i1, %i0 {permutation_map: #[[map_proj_d0d1d2_d2d1]]} : (memref<?x?x?xf32>, index, index, index) -> vector<32x256xf32>
// CHECK: %3 = vector_transfer_read %arg0[%i2, %i1, %i0] {permutation_map: #[[map_proj_d0d1d2_d2d1]]} : memref<?x?x?xf32>, vector<32x256xf32>
// CHECK: affine.for %i3 = 0 to %1 {
// CHECK: affine.for %i4 = 0 to %2 step 256 {
// CHECK: %4 = vector_transfer_read %arg0, %i3, %i4, %i0 {permutation_map: #[[map_proj_d0d1d2_d2d1]]} : (memref<?x?x?xf32>, index, index, index) -> vector<32x256xf32>
// CHECK: %4 = vector_transfer_read %arg0[%i3, %i4, %i0] {permutation_map: #[[map_proj_d0d1d2_d2d1]]} : memref<?x?x?xf32>, vector<32x256xf32>
// CHECK: affine.for %i5 = 0 to %2 step 256 {
// CHECK: %5 = vector_transfer_read %arg0, %i3, %i5, %i0 {permutation_map: #[[map_proj_d0d1d2_d2d1]]} : (memref<?x?x?xf32>, index, index, index) -> vector<32x256xf32>
// CHECK: %5 = vector_transfer_read %arg0[%i3, %i5, %i0] {permutation_map: #[[map_proj_d0d1d2_d2d1]]} : memref<?x?x?xf32>, vector<32x256xf32>
affine.for %i0 = 0 to %0 {
affine.for %i1 = 0 to %1 {
affine.for %i2 = 0 to %2 {