Describe broadcasting rules (#983)

closes #981

Co-authored-by: ABBY CROSS <across@us.ibm.com>
Co-authored-by: Jessie Yu <jessieyu@us.ibm.com>
Co-authored-by: Elena Peña Tapia <57907331+ElePT@users.noreply.github.com>
2024-03-13 14:27:00 -05:00 · 2024-03-13 14:27:00 -05:00 · dcdde8aed1
parent e1a7e6abfb
commit dcdde8aed1
2 changed files with 1812 additions and 6 deletions
--- a/docs/run/primitives.mdx
+++ b/docs/run/primitives.mdx
@@ -83,21 +83,108 @@ The updated interface uses a *primitive unified bloc* (PUB) for input.  Each PUB
 Example: 

 ```python
-estimator.run([circuit1, circuit2, ...],[observable1, observable2, ...],[param_values1, param_values2, ...] )
+estimator.run([circuit1, circuit2, ...],[observable1, observable2, ...],
+  [param_values1, param_values2, ...] )
 ```

 #### Estimator V2

-* Takes one parameter: PUBs in the format (`<circuit>`, `<observables>`, `<parameter values>`, `<precision>`)
-* Numpy [broadcasting rules](https://numpy.org/doc/stable/user/basics.broadcasting.html) are used when combining observables and parameter values.
+* The `run()` method takes an array of PUBs. Each PUB is in the format (`<single circuit>`, `<one or more observables>`, `<optional one or more parameter values>`, `<optional precision>`), where `parameter values`, if given, can be a single value or a list of values.
+* Combines elements from observables and parameter values by following NumPy broadcasting rules as described below. 
 * Each input PUB has a corresponding PubResult that contains both data and metadata.

 Example: 

 ```python
-estimator.run([(circuit1, observable1, param_values1),(circuit2, observable2, param_values2) ])
+estimator.run([(circuit1, observable1, param_values1),(circuit2, observable2, param_values2)])
 ```

+<span id="broadcast-rules"></span>
+##### Broadcasting rules
+
+Estimator V2 aggregates elements from multiple arrays (observables and parameter values) by following the same broadcasting rules as NumPy. This section summarizes those rules.  For a detailed explanation, see the [NumPy broadcasting rules documentation](https://numpy.org/doc/stable/user/basics.broadcasting.html).
+
+Rules:
+
+* Input arrays do not need to have the same number of dimensions. 
+  * The resulting array will have the same number of dimensions as the input array with the most dimensions.
+  * The size of each dimension is the largest size of that dimension among the input arrays.
+  * Missing dimensions are assumed to have size one.
+* Shape comparisons start with the rightmost dimension and continue to the left.
+* Two dimensions are compatible if their sizes are equal or if one of them is 1.
+
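The rules above can be sketched in a few lines of plain Python. This is only an illustration of the shape arithmetic, not how NumPy or the Estimator implements it; the helper name `broadcast_shape` is hypothetical and does not come from either library.

```python
from itertools import zip_longest


def broadcast_shape(*shapes):
    """Return the broadcast shape of the given shapes, or raise ValueError."""
    result = []
    # Walk the dimensions right to left; missing dimensions count as size 1.
    for sizes in zip_longest(*(reversed(s) for s in shapes), fillvalue=1):
        # Sizes are compatible if, ignoring 1s, they all agree.
        non_unit = {n for n in sizes if n != 1}
        if len(non_unit) > 1:
            raise ValueError(f"shapes {shapes} do not broadcast")
        # The output size is the single non-unit size, or 1 if all sizes are 1.
        result.append(non_unit.pop() if non_unit else 1)
    return tuple(reversed(result))
```

For instance, `broadcast_shape((1,), (3, 5))` gives `(3, 5)`, while `broadcast_shape((5,), (3,))` raises `ValueError`.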
+Examples of array pairs that broadcast:
+
+```text
+A1     (1d array):      1
+A2     (2d array):  3 x 5
+Result (2d array):  3 x 5
+
+
+A1     (3d array):  11 x 2 x 7
+A2     (3d array):  11 x 1 x 7
+Result (3d array):  11 x 2 x 7
+```
+
+Examples of array pairs that do not broadcast:
+
+```text
+A1     (1d array):  5
+A2     (1d array):  3 
+
+A1     (2d array):      2 x 1
+A2     (3d array):  6 x 5 x 4 # This would work if the middle dimension were 2, but it is 5.
+```
+
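The four shape pairs listed above can be checked directly with NumPy. The sketch below assumes NumPy 1.20 or later, where `np.broadcast_shapes` is available; it works on shapes only and does not involve the Estimator.

```python
import numpy as np

# Shape pairs from the text examples: the first two broadcast, the last two do not.
pairs = [
    ((1,), (3, 5)),
    ((11, 2, 7), (11, 1, 7)),
    ((5,), (3,)),
    ((2, 1), (6, 5, 4)),
]

results = []
for a1, a2 in pairs:
    try:
        results.append(np.broadcast_shapes(a1, a2))
    except ValueError:
        # NumPy raises ValueError when two dimensions are incompatible.
        results.append(None)

print(results)
```

The first two entries of `results` are the broadcast shapes `(3, 5)` and `(11, 2, 7)`; the last two are `None` because the shapes are incompatible.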
+`EstimatorV2` returns one expectation value estimate for each element of the broadcasted shape.  
+
+Here are some examples of common patterns expressed in terms of array broadcasting.  Their accompanying visual representation is shown in the figure that follows:
+
+```python
+import numpy as np
+from qiskit.quantum_info import SparsePauliOp
+
+# Broadcast single observable
+parameter_values = np.random.uniform(size=(5,))  # shape (5,)
+observables = SparsePauliOp("ZZZ")  # shape ()
+# >> pub result has shape (5,)
+
+# Zip
+parameter_values = np.random.uniform(size=(5,))  # shape (5,)
+observables = [SparsePauliOp(pauli) for pauli in ["III", "XXX", "YYY", "ZZZ", "XYZ"]]  # shape (5,)
+# >> pub result has shape (5,)
+
+# Outer/Product
+parameter_values = np.random.uniform(size=(1, 6))  # shape (1, 6)
+observables = [[SparsePauliOp(pauli)] for pauli in ["III", "XXX", "YYY", "ZZZ"]]  # shape (4, 1)
+# >> pub result has shape (4, 6)
+
+# Standard nd generalization
+parameter_values = np.random.uniform(size=(3, 6))  # shape (3, 6)
+observables = [
+    [[SparsePauliOp(['XII'])], [SparsePauliOp(['IXI'])], [SparsePauliOp(['IIX'])]],
+    [[SparsePauliOp(['ZII'])], [SparsePauliOp(['IZI'])], [SparsePauliOp(['IIZ'])]]
+]  # shape (2, 3, 1)
+# >> pub result has shape (2, 3, 6)
+```
+
+![Parameter value sets are represented by n x m arrays, and observable arrays are represented by one or more single-column arrays. For each example in the previous code, the parameter value sets are combined with their observable array to create the resulting EV estimates.  Example 1 (broadcast single observable) has a parameter value set that is a 5x1 array and a 1x1 observables array.  The one item in the observables array is combined with each item in the parameter value set to create a single 5x1 array where each item is a combination of the original item in the parameter value set with the item in the observables array.  Example 2 (zip) has a 5x1 parameter value set and a 5x1 observables array.  The output is a 5x1 array where each item is a combination of the nth item in the parameter value set with the nth item in the observables array.  Example 3 (outer/product) has a 1x6 parameter value set and a 4x1 observables array.  Their combination results in a 4x6 array that is created by combining each item in the parameter value set with *every* item in the observables array.  Thus, each parameter value becomes an entire column in the output.  Example 4 (Standard nd generalization) has a 3x6 parameter value set array and two 3x1 observables arrays.  They combine to create two 3x6 output arrays in a similar manner to the previous example.](/images/run/broadcasting.svg "Visual representation of broadcasting")
+
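To make the outer/product pattern concrete, the following sketch shows which observable and which parameter set each element of the broadcast result would pair up. It uses plain string labels as hypothetical stand-ins for `SparsePauliOp` objects and parameter value sets, and it does not call the Estimator; it only illustrates the index bookkeeping with NumPy.

```python
import numpy as np

# Hypothetical labels standing in for observables and parameter value sets.
observables = np.array([["III"], ["XXX"], ["YYY"], ["ZZZ"]])  # shape (4, 1)
parameter_sets = np.array([[f"theta_{j}" for j in range(6)]])  # shape (1, 6)

# Expand both arrays to the common broadcast shape (4, 6).
obs_b, par_b = np.broadcast_arrays(observables, parameter_sets)

# Map every index of the broadcast shape to its (observable, parameter set) pair.
pairs = {
    idx: (str(obs_b[idx]), str(par_b[idx]))
    for idx in np.ndindex(obs_b.shape)
}
```

Here `pairs[(2, 5)]` is `("YYY", "theta_5")`: row index 2 selects the third observable and column index 5 selects the sixth parameter set, matching the 4x6 grid in the figure.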
+<Admonition type="tip" title="SparsePauliOp">
+Each `SparsePauliOp` counts as a single element in this context, regardless of the number of Paulis contained in the `SparsePauliOp`. Thus, for the purpose of these broadcasting rules, all of the following elements have the same shape:
+
+```text
+a = SparsePauliOp("Z") # shape ()
+b = SparsePauliOp("IIIIZXYIZ") # shape ()
+c = SparsePauliOp(["XX", "XY", "IZ"]) # shape ()
+```
+
+The following two representations, while equivalent in terms of information contained, have different shapes:
+
+```text
+list1 = SparsePauliOp(["XX", "XY", "IZ"]) # shape ()
+list2 = [SparsePauliOp("XX"), SparsePauliOp("XY"), SparsePauliOp("IZ")] # shape (3,)
+```
+</Admonition>
+
 #### Sampler V1

 * Takes two parameters: circuits and parameter values.
@@ -113,8 +200,8 @@ sampler.run([circuit1, circuit2, ...],[observable1, observable2, ...],[param_val

 #### Sampler V2

-* Takes one parameter: PUBs in the format  (`<circuit>`, `<parameter values>`, `<shots>`)
-* Elements from each are aggregated.
+* Takes one parameter: PUBs in the format (`<single circuit>`, `<optional one or more parameter values>`, `<optional shots>`), where there can be multiple `parameter values` items, and each item can be either an array or a single parameter value, depending on the chosen circuit.
+* Elements from each are aggregated. For example, each array of parameter values in the PUB is applied to the PUB's circuit.
 * Obeys program outputs. Typically this is a bit array but can also be an array of complex numbers (measurement level 1). 
 * Returns raw data type. Data from each shot is returned (analogous to `memory=True` in the `backend.run` interface), and post-processing is done by using convenience methods.
 * Output data is grouped by output registers.
@@ -195,3 +282,4 @@ The Qiskit Runtime primitives provide a more sophisticated implementation (for e
    - Read [Migrate to V2 primitives](/api/migration-guides/v2-primitives).
    - Practice with primitives by working through the [Cost function lesson](https://learning.quantum.ibm.com/course/variational-algorithm-design/cost-functions#primitives) in IBM Quantum&trade; Learning.
 </Admonition>
+
--- a/public/images/run/broadcasting.svg
+++ b/public/images/run/broadcasting.svg