Enum OperationType
Represents different operation types in computation graphs for JIT compilation and automatic differentiation.
public enum OperationType
Fields
Abs = 8: Element-wise absolute value - |x| for each element.
Activation = 35: Generic activation function application.
AdaptivePooling = 146: Adaptive pooling.
Add = 2: Element-wise addition of two tensors.
AffineGrid = 64: Affine grid generation for spatial transformers.
And = 168: Logical AND.
AnomalyScore = 107: Anomaly score computation.
Attention = 130: Generic attention mechanism operation.
AveragePooling = 143: Average pooling operation.
AvgPool2D = 60: 2D average pooling.
BatchNorm = 62: Batch normalization.
BatchNormalization = 138: Batch normalization.
BentIdentity = 32: Bent Identity - (sqrt(x² + 1) - 1) / 2 + x, smooth alternative to ReLU.
Broadcast = 129: Broadcast operation - expands tensor dimensions to match target shape.
CELU = 30: Continuously Differentiable ELU - max(0, x) + min(0, α * (exp(x/α) - 1)).
CRFForward = 106: CRF forward algorithm for sequence labeling.
Cast = 172: Type cast operation.
CliffordInnerProduct = 80: Inner (contraction) product of multivectors.
Clip = 173: Clip values to range.
ComplexMatMul = 72: Complex matrix multiplication for quantum operations.
ComplexMultiply = 73: Element-wise complex multiplication.
Concat = 44: Concatenate multiple tensors along an axis.
Constant = 1: Constant node - represents a constant value that doesn't require gradients.
Conv2D = 52: 2D convolution operation.
ConvTranspose2D = 53: 2D transposed convolution (deconvolution).
Convolution = 132: General convolution operation.
Convolution2D = 133: 2D convolution operation.
Convolution3D = 134: 3D convolution operation.
Crop = 46: Crop tensor by removing border elements.
CrossAttention = 152: Cross-attention operation.
Custom = 126: Custom user-defined operation for extensibility.
Deconvolution = 137: Deconvolution (transposed convolution) operation.
DeformableConv2D = 57: 2D deformable convolution with learnable offsets and optional modulation.
Dense = 147: Dense (fully connected) layer.
DepthwiseConv2D = 55: 2D depthwise convolution.
DepthwiseConvolution = 135: Depthwise convolution operation.
DilatedConv2D = 54: 2D dilated (atrous) convolution.
DilatedConvolution = 136: Dilated convolution operation.
Divide = 5: Element-wise division of two tensors.
DropPath = 160: DropPath regularization.
Dropout = 127: Dropout regularization operation - randomly zeros elements during training.
ELU = 20: Exponential Linear Unit - ELU(x) = x if x > 0, alpha * (exp(x) - 1) otherwise.
Embedding = 67: Embedding lookup operation.
Equal = 163: Element-wise equality.
Exp = 9: Element-wise exponential function - e^x for each element.
Expand = 159: Expand tensor dimensions.
FakeQuantization = 125: Fake quantization operation with Straight-Through Estimator (STE) for differentiable quantization. Forward: quantized = round(x / scale) * scale. Backward: the gradient passes through unchanged (STE). See the sketch after this list.
Flatten = 156: Flatten tensor to 1D.
FullyConnected = 148: Fully connected layer.
FusedAddReLU = 100: Fused addition + ReLU.
FusedConvBatchNorm = 99: Fused convolution + batch normalization.
FusedConvBatchNormReLU = 175: Fused Conv + BatchNorm + ReLU.
FusedLayerNormAttention = 180: Fused LayerNorm + Attention.
FusedLinearReLU = 98: Fused linear layer with ReLU (MatMul + Add + ReLU).
FusedMatMulAdd = 97: Fused matrix multiplication + addition (MatMul + Add).
FusedMatMulBias = 176: Fused MatMul + Bias.
FusedMatMulBiasGELU = 178: Fused MatMul + Bias + GELU.
FusedMatMulBiasReLU = 177: Fused MatMul + Bias + ReLU.
FusedMultiHeadAttention = 179: Fused multi-head attention.
GELU = 22: Gaussian Error Linear Unit - x * Φ(x) where Φ is the standard normal CDF.
GRU = 154: GRU recurrent layer.
GRUCell = 70: GRU cell operation for recurrent networks.
Gather = 128: Gather operation - selects elements from a tensor using indices.
Gaussian = 33: Gaussian activation - exp(-x²), bell-shaped response curve.
Gemm = 149: General Matrix Multiplication.
GeometricProduct = 78: Geometric product of multivectors.
GlobalAveragePooling = 144: Global average pooling.
GlobalMaxPooling = 145: Global max pooling.
GradeProject = 83: Grade projection of multivectors.
GraphConv = 66: Graph convolutional operation for GNNs.
Greater = 164: Element-wise greater than.
GreaterOrEqual = 166: Element-wise greater than or equal.
GridSample = 65: Grid sampling for spatial transformers.
GroupNormalization = 141: Group normalization.
GumbelSoftmax = 101: Gumbel-Softmax for differentiable discrete sampling (used in stochastic layers).
HardSigmoid = 27: Hard Sigmoid - piecewise linear approximation of sigmoid: clip((x + 1) / 2, 0, 1).
HardTanh = 28: Hard Tanh - piecewise linear approximation of tanh: clip(x, -1, 1).
HierarchicalSoftmax = 121: Hierarchical Softmax - tree-based efficient softmax for large vocabularies.
HyperboloidDistance = 88: Hyperboloid distance metric.
ISRU = 110: Inverse Square Root Unit - x / sqrt(1 + alpha * x²).
Input = 0: Input node - represents a variable or parameter in the computation graph.
InstanceNormalization = 140: Instance normalization.
LSTM = 153: LSTM recurrent layer.
LSTMCell = 71: LSTM cell operation for recurrent networks.
LayerNorm = 61: Layer normalization.
LayerNormalization = 139: Layer normalization.
LeakyReLU = 21: Leaky Rectified Linear Unit - max(alpha * x, x) where alpha is typically 0.01.
LeakyStateUpdate = 105: Leaky state update for reservoir/echo state networks.
Less = 165: Element-wise less than.
LessOrEqual = 167: Element-wise less than or equal.
LiSHT = 31: Linearly Scaled Hyperbolic Tangent - x * tanh(x).
LocallyConnectedConv2D = 56: 2D locally connected convolution.
Log = 10: Element-wise natural logarithm.
LogSoftmax = 112: Log-Softmax - log(softmax(x)), numerically stable for cross-entropy loss.
LogSoftmin = 114: Log-Softmin - log(softmin(x)) = log(softmax(-x)).
MatMul = 14: Matrix multiplication (not element-wise).
MaxPool2D = 58: 2D max pooling.
MaxPool3D = 59: 3D max pooling.
MaxPooling = 142: Max pooling operation.
Maxout = 116: Maxout activation - maximum over multiple linear pieces.
Mean = 41: Mean operation (reduces all dimensions).
Mish = 24: Mish activation - x * tanh(softplus(x)).
MobiusAdd = 84: Mobius addition in the Poincare ball model.
MultiHeadAttention = 69: Multi-head attention operation.
Multiply = 4: Element-wise multiplication (Hadamard product) of two tensors.
MultivectorAdd = 81: Multivector addition.
MultivectorReverse = 82: Multivector reverse operation.
Negate = 7: Element-wise negation - multiplies each element by -1.
Norm = 13: L2 norm computation along an axis - sqrt(sum(x²)).
Not = 170: Logical NOT.
OctonionAdd = 77: Octonion addition.
OctonionConjugate = 76: Octonion conjugation.
OctonionMatMul = 75: Octonion matrix multiplication for neural networks.
OctonionMultiply = 74: Octonion multiplication (non-associative).
Or = 169: Logical OR.
Output = 131: Output node in the computation graph.
PReLU = 108: Parametric Rectified Linear Unit - max(0, x) + alpha * min(0, x) where alpha is learned.
Pad = 45: Pad tensor with values.
Permute = 43: Permute tensor dimensions (general transpose).
PixelShuffle = 51: Pixel shuffle operation for upsampling.
PoincareDistance = 87: Poincare ball distance metric.
PoincareExpMap = 85: Poincare exponential map.
PoincareLogMap = 86: Poincare logarithmic map.
PositionalEncoding = 161: Positional encoding for transformers.
Power = 6: Element-wise power operation - raises each element to a specified exponent.
RBFKernel = 63: RBF (Radial Basis Function) kernel operation.
RNN = 155: Basic RNN layer.
RReLU = 117: Randomized Leaky ReLU - LeakyReLU with random alpha during training.
ReLU = 16: Rectified Linear Unit - max(0, x).
ReduceLogVariance = 40: Log-variance reduction along specified axes.
ReduceMax = 39: Maximum value reduction along specified axes.
ReduceMean = 38: Mean reduction along specified axes.
ReduceMin = 150: Minimum value reduction.
ReduceSum = 37: Sum reduction along specified axes.
Reshape = 42: Reshape tensor to new dimensions.
SELU = 26: Scaled Exponential Linear Unit - self-normalizing activation with fixed lambda and alpha.
SQRBF = 115: Square Radial Basis Function - smooth bell-shaped activation.
ScaledDotProductAttention = 68: Scaled dot-product attention.
ScaledTanh = 34: Scaled Tanh - parameterized tanh with adjustable steepness β.
Scatter = 174: Scatter values to indices.
Se3Exp = 95: SE(3) exponential map.
Se3Log = 96: SE(3) logarithmic map.
SelfAttention = 151: Self-attention operation.
Sigmoid = 17: Sigmoid activation - 1 / (1 + e^(-x)).
Sign = 111: Sign function with surrogate gradient for training - returns -1, 0, or 1.
Slice = 48: Slice tensor along an axis - extracts a portion with optional stride.
So3Exp = 93: SO(3) exponential map.
So3Log = 94: SO(3) logarithmic map.
SoftKNN = 123: Soft K-Nearest Neighbors operation for differentiable instance-based learning. Uses attention-weighted contributions from all support vectors instead of hard k-selection: weights = softmax(-distances / temperature), output = Σ weights * labels.
SoftLocallyWeighted = 124: Soft locally-weighted regression operation for differentiable instance-based learning. Uses an attention-weighted linear combination of training targets based on distance: weights = softmax(-||x - X_train||² / bandwidth), output = weights @ y_train.
SoftPlus = 25: SoftPlus activation - ln(1 + e^x), smooth approximation of ReLU.
SoftSign = 29: SoftSign activation - x / (1 + |x|), alternative to tanh with polynomial tails.
SoftSplit = 122: Soft split operation for differentiable decision trees. Uses sigmoid gating: p_left = σ((threshold - x[feature]) / temperature); output = p_left * left_value + (1 - p_left) * right_value. See the sketch after this list.
Softmax = 19: Softmax activation - converts logits to a probability distribution.
Softmin = 113: Softmin - softmax(-x), assigns higher probability to lower values.
SpMM = 90: Sparse matrix-matrix multiplication.
SpMV = 89: Sparse matrix-vector multiplication.
SparseGather = 91: Sparse gather operation.
SparseScatter = 92: Sparse scatter operation.
Sparsemax = 120: Sparsemax - projects onto the probability simplex, can produce sparse outputs.
SphericalSoftmax = 118: Spherical Softmax - L2 normalization followed by softmax.
Split = 47: Split tensor along an axis into multiple tensors.
Sqrt = 11: Element-wise square root.
Square = 12: Element-wise square - x² for each element.
Squash = 36: Squashing activation for capsule networks - s(v) = ||v||² / (1 + ||v||²) * (v / ||v||).
Squeeze = 157: Remove dimensions of size 1.
Stack = 162: Stack tensors along a new axis.
StraightThroughThreshold = 103: Straight-through threshold for HTM-style sparse activations.
Subtract = 3: Element-wise subtraction of two tensors.
SurrogateSpike = 102: Surrogate spike function for spiking neural networks with gradient estimation.
Swish = 23: Swish/SiLU activation - x * sigmoid(x).
Tanh = 18: Hyperbolic tangent activation.
TaylorSoftmax = 119: Taylor Softmax - softmax using a Taylor series approximation of exp.
ThresholdedReLU = 109: Thresholded Rectified Linear Unit - x if x > threshold, 0 otherwise.
TopKSoftmax = 104: Top-K softmax for mixture-of-experts routing.
Transpose = 15: Matrix transpose - swaps rows and columns.
Unknown = 181: Unknown operation type.
Unsqueeze = 158: Add a dimension of size 1.
Upsample = 49: Upsample tensor by repeating elements.
Upsample3D = 50: 3D upsampling operation for volumetric data. Increases spatial resolution by repeating or interpolating values in depth, height, and width.
WedgeProduct = 79: Wedge (outer) product of multivectors.
Xor = 171: Logical XOR.
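For concreteness, here is a minimal scalar sketch of the FakeQuantization forward and backward rules quoted above. The class and method names are illustrative assumptions, not the library's API; the real operation is applied tensor-wise inside the computation graph.

using System;

// Illustrative only: a scalar sketch of the fake-quantization rule described above.
static class FakeQuantizationSketch
{
    public static double Forward(double x, double scale)
    {
        // Forward: round(x / scale) * scale snaps the value onto the quantization grid.
        return Math.Round(x / scale) * scale;
    }

    public static double Backward(double upstreamGradient)
    {
        // Backward (straight-through estimator): round() has zero derivative almost
        // everywhere, so the incoming gradient is passed through unchanged.
        return upstreamGradient;
    }
}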
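Likewise, the SoftSplit gating formula reads as a small function. The scalar signature below is an assumption made for illustration; in the graph the operation gates tensors of node values.

using System;

// Illustrative only: scalar sketch of the SoftSplit sigmoid gate described above.
static class SoftSplitSketch
{
    public static double Evaluate(double featureValue, double threshold,
                                  double temperature, double leftValue, double rightValue)
    {
        // p_left = σ((threshold - x[feature]) / temperature)
        double pLeft = 1.0 / (1.0 + Math.Exp(-(threshold - featureValue) / temperature));

        // Blend the two branch values instead of taking a hard left/right decision,
        // which keeps the split differentiable with respect to the threshold and input.
        return pLeft * leftValue + (1.0 - pLeft) * rightValue;
    }
}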
Remarks
For Beginners: Operation types identify mathematical operations performed on tensors in neural networks.
When building a computation graph, each operation (like adding two tensors or applying an activation function) needs to be identified so that:
- The JIT compiler can optimize the code
- The automatic differentiation system can compute gradients correctly
- The system can analyze and transform the computation graph
This enum provides type-safe identification of operations, preventing typos and enabling better tooling support.
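To illustrate the dispatch described above, the sketch below switches on OperationType to pick a local derivative rule, the way an automatic differentiation pass might. Only the enum values come from this page; the GradientRules class and its scalar helper are hypothetical placeholders, not library API.

using System;

static class GradientRules
{
    // Hypothetical sketch: an autodiff engine can dispatch on OperationType to select
    // the local derivative rule for a node. Scalar inputs are used here for brevity.
    public static double LocalGradient(OperationType op, double input)
    {
        switch (op)
        {
            case OperationType.ReLU:
                // d/dx max(0, x) = 1 for x > 0, otherwise 0.
                return input > 0 ? 1.0 : 0.0;
            case OperationType.Exp:
                // d/dx e^x = e^x.
                return Math.Exp(input);
            case OperationType.Negate:
                // d/dx (-x) = -1.
                return -1.0;
            default:
                throw new NotSupportedException($"No scalar gradient rule in this sketch for {op}.");
        }
    }
}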