Softmax - 1 vs 13

The next section compares an older and a newer version of the same operator after both definitions are converted into markdown text. Green marks an addition in the newer version, red marks a deletion. Anything else is unchanged.

Files changed (1)
  1. Softmax1 → Softmax13 +13 -19
Softmax1 → Softmax13 RENAMED
@@ -1 +1 @@
- The operator computes the softmax (normalized exponential) values for each layer in the batch
+ The operator computes the normalized exponential values for the given input:
+ Softmax(input, axis) = Exp(input) / ReduceSum(Exp(input), axis=axis, keepdims=1)
+
- of the given input. The input is a 2-D tensor (Tensor<float>) of size
+ The "axis" attribute indicates the dimension along which Softmax
- (batch_size x input_feature_dimensions). The output tensor has the same shape
+ will be performed. The output tensor has the same shape
- and contains the softmax values of the corresponding input.
+ and contains the Softmax values of the corresponding input.
- Input does not need to explicitly be a 2D vector; rather, it will be
- coerced into one. For an arbitrary n-dimensional tensor
- input in [a_0, a_1, ..., a_{k-1}, a_k, ..., a_{n-1}] and k is
- the axis provided, then input will be coerced into a 2-dimensional tensor with
- dimensions [a_0 * ... * a_{k-1}, a_k * ... * a_{n-1}]. For the default
- case where axis=1, this means the input tensor will be coerced into a 2D tensor
- of dimensions [a_0, a_1 * ... * a_{n-1}], where a_0 is often the batch size.
- In this situation, we must have a_0 = N and a_1 * ... * a_{n-1} = D.
- Each of these dimensions must be matched correctly, or else the operator
- will throw errors.
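
The practical difference between the two descriptions above can be sketched in NumPy. This is an illustration under stated assumptions, not the reference implementation of either version: the helper names `softmax_v1` and `softmax_v13` are hypothetical, and the max-subtraction step is a common numerical-stability trick that is not part of the formula in the spec text.

```python
import numpy as np


def softmax_v13(x, axis=-1):
    """Softmax-13 reading: normalize along a single axis."""
    # Subtracting the max does not change the result; it only avoids overflow
    # in exp. This step is an assumption added here, not taken from the spec.
    shifted = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / np.sum(e, axis=axis, keepdims=True)


def softmax_v1(x, axis=1):
    """Softmax-1 reading: coerce to 2D at `axis`, normalize rows, restore shape."""
    rows = int(np.prod(x.shape[:axis]))  # a_0 * ... * a_{k-1}
    flat = x.reshape(rows, -1)           # [rows, a_k * ... * a_{n-1}]
    return softmax_v13(flat, axis=1).reshape(x.shape)


x = np.arange(24, dtype=np.float64).reshape(2, 3, 4)
old = softmax_v1(x, axis=1)
new = softmax_v13(x, axis=1)

print(old.sum(axis=(1, 2)))  # old: all 12 values of each batch entry sum to 1
print(new.sum(axis=1))       # new: each slice along axis 1 sums to 1
```

For a 2D input, or when `axis` is the last dimension, the two readings give the same result; they diverge only when dimensions after `axis` would have been flattened together.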
  ### Attributes
- * **axis - INT** (default is '1'):
+ * **axis - INT** (default is '-1'):
- Describes the axis of the inputs when coerced to 2D; defaults to one because the 0th axis most likely describes the batch_size
+ Describes the dimension Softmax will be performed on.
+ Negative value means counting dimensions
+ from the back. Accepted range is [-r, r-1] where r = rank(input).
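
The negative-axis rule in the newer attribute text can be sketched as follows. `normalize_axis` is a hypothetical helper written for illustration; it is not an ONNX API, and raising `ValueError` for an out-of-range axis is an assumption about how a caller might choose to report the error.

```python
def normalize_axis(axis, rank):
    """Map an axis in [-rank, rank - 1] to its non-negative equivalent."""
    if not -rank <= axis <= rank - 1:
        # Error type is an assumption; the spec text only states the accepted range.
        raise ValueError(f"axis {axis} is outside [-{rank}, {rank - 1}]")
    return axis + rank if axis < 0 else axis


rank = 3
print(normalize_axis(-1, rank))  # new default -1 selects the last dimension: 2
print(normalize_axis(1, rank))   # old default 1 is unchanged: 1
```

With the new default of -1, a model that omitted `axis` under the old version and has an input of rank greater than 2 will normalize over a different set of elements after upgrading.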
  ### Inputs
  - **input** (heterogeneous) - **T**:
- The input tensor that's coerced into a 2D matrix of size (NxD) as described above.
+ The input tensor of rank >= axis.
  ### Outputs
  - **output** (heterogeneous) - **T**:
- The output values with the same shape as input tensor (the original size without coercion).
+ The output values with the same shape as the input tensor.
  ### Type Constraints
- * **T** in ( tensor(double), tensor(float), tensor(float16) ):
+ * **T** in ( tensor(bfloat16), tensor(double), tensor(float), tensor(float16) ):
  Constrain input and output types to float tensors.