PCA
Class: NodeImagePCA
Principal component analysis (PCA) is a method often used to reduce the dimensionality of data by transforming a large set of variables into a smaller one that still contains most of the information in the full dataset. It can make data easier to explore and visualize. By expressing the dataset in terms of the components (combinations of variables) that contribute most to the variation in the data, the number of variables needed to describe the dataset can be reduced. The cost is that some high-frequency information is lost, although this loss also allows PCA to be used as a noise reduction technique.
The figure shows PCA of a 2D dataset. After PCA, component 2 contains very little information and practically all of the variance in the dataset is described by component 1.
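The situation described for the figure can be reproduced with a minimal NumPy sketch. It is only an illustration, not the Extreme Optimization code used by MICE, and the synthetic dataset and all variable names are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 2D dataset: the second variable is almost a multiple of the first.
x = rng.normal(size=500)
y = 2.0 * x + 0.1 * rng.normal(size=500)
data = np.column_stack([x, y])            # shape (500, 2)

# Centre the data and diagonalise its covariance matrix.
centered = data - data.mean(axis=0)
cov = np.cov(centered, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# eigh returns eigenvalues in ascending order; put component 1 first.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Fraction of the total variance described by each component.
explained = eigenvalues / eigenvalues.sum()
print(explained)
```

In this sketch, component 1 accounts for nearly all of the variance, so component 2 could be dropped with little loss.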
The PCA in MICE is based on the Extreme Optimization numerical library.
Example workflows
Inputs
Image
The input image to be analyzed.
Type: Image4DFloat, Required, Single
Mask
A mask defining which area should be included in the analysis. Must have the same matrix size as the input image.
Type: Image4DBool, Optional, Single
Outputs
Components
The components found using PCA. The number of components equals the size of the input data along the analyzed dimension, i.e. if you input 3 frames of a time series and perform PCA along the T dimension, the number of components will be 3.
Type: Image4DFloat
Prediction
The input data reconstructed (predicted) from the number of components given by Number of Components in the node settings.
Type: Image4DFloat
Data
A table containing the eigenvalues and eigenvectors of the components.
Type: DataCollection
Settings
Scaling Method Selection
When the variables in a PCA use very different scales, the principal components give more weight to the variables with larger values. To put all variables on an equal footing, the variables are often scaled. The ScalingMethod property determines if and how this transformation is performed. It is of type ScalingMethod, which can take the following values:
Value | Description |
---|---|
None | No scaling is performed. |
UnitVariance | The columns are scaled to have unit variance. This is the default. |
VectorNorm | The columns are scaled to have unit norm. |
Pareto | The columns are scaled by the square root of the standard deviation. |
Range | The columns are scaled to have unit range (difference between largest and smallest value). |
Level | The columns are scaled by the column mean. |
Values: None, UnitVariance, VectorNorm, Pareto, Range, Level
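The scaling methods in the table can be sketched in NumPy as follows. This is one possible reading of the descriptions, not the Extreme Optimization implementation: the function name `scale_columns` is hypothetical, and the use of the sample standard deviation (`ddof=1`) and the Euclidean norm are assumptions.

```python
import numpy as np

def scale_columns(data, method="UnitVariance"):
    """Divide each column of `data` by the factor implied by `method`."""
    if method == "None":
        return data.copy()
    factors = {
        # Sample standard deviation (ddof=1) is an assumption.
        "UnitVariance": lambda d: d.std(axis=0, ddof=1),
        # Euclidean norm is an assumption.
        "VectorNorm": lambda d: np.linalg.norm(d, axis=0),
        "Pareto": lambda d: np.sqrt(d.std(axis=0, ddof=1)),
        "Range": lambda d: d.max(axis=0) - d.min(axis=0),
        "Level": lambda d: d.mean(axis=0),
    }
    return data / factors[method](data)

# Two variables on very different scales.
data = np.array([[1.0, 100.0],
                 [2.0, 300.0],
                 [4.0, 500.0]])

unit = scale_columns(data, "UnitVariance")
ranged = scale_columns(data, "Range")
normed = scale_columns(data, "VectorNorm")
leveled = scale_columns(data, "Level")
```

After any of these scalings the two columns have comparable magnitudes, so neither dominates the principal components purely because of its units.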
Number of Components Integer
The number of components used to reconstruct the input data in the Prediction output. If it is set to the dimensionality of the input data, the prediction equals the input. If it is set to a lower value, the prediction contains less noise.
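The effect of this setting can be illustrated with a small NumPy sketch. It assumes the data is arranged as observations by variables and uses a singular value decomposition; the helper `pca_predict` and the synthetic signal are hypothetical and do not reflect how MICE computes the Prediction output internally.

```python
import numpy as np

def pca_predict(data, n_components):
    """Reconstruct `data` (observations x variables) from its leading components."""
    mean = data.mean(axis=0)
    centered = data - mean
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]
    scores = centered @ basis.T           # project onto the kept components
    return scores @ basis + mean          # map back to the original variables

rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 200)
# Rank-1 signal shared by 4 variables, plus independent noise.
clean = np.outer(np.sin(2 * np.pi * t), [1.0, 0.5, -0.8, 0.3])
noisy = clean + 0.05 * rng.normal(size=clean.shape)

full = pca_predict(noisy, 4)       # all components: reproduces the input
reduced = pca_predict(noisy, 1)    # one component: part of the noise is discarded

err_noisy = np.abs(noisy - clean).mean()
err_reduced = np.abs(reduced - clean).mean()
```

Here `full` matches the noisy input, while `reduced` lies closer to the clean signal because the discarded components carry mostly noise.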
Dimension Selection
The image dimension along which the PCA is performed.
Values: X, Y, Z, T
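One way to picture PCA along the T dimension is to treat every masked voxel as an observation and every frame as a variable. The NumPy sketch below follows that reading; the array layout (X, Y, Z, T), the purely spatial mask, and all names are assumptions for illustration and may differ from how MICE arranges the data.

```python
import numpy as np

rng = np.random.default_rng(2)
image = rng.normal(size=(8, 8, 4, 3))     # hypothetical (X, Y, Z, T) image, 3 frames
mask = np.zeros((8, 8, 4), dtype=bool)    # hypothetical spatial mask
mask[2:6, 2:6, :] = True

# Each masked voxel becomes one row; its values along T become the columns.
samples = image[mask]                     # shape (n_masked_voxels, 3)

centered = samples - samples.mean(axis=0)
_, singular_values, vt = np.linalg.svd(centered, full_matrices=False)

# One component per frame along the analyzed dimension.
n_components = vt.shape[0]
print(samples.shape, n_components)
```

With 3 frames along T, the decomposition yields 3 components, which matches the description of the Components output.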
Zero Variance Compensation Number
If scaling is used, this value is added to one element of each column that has no variance. Without it the scaling would fail, since scaling factors such as the standard deviation or the range would be zero.
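The idea can be sketched in NumPy as follows. Which element of a constant column receives the compensation, and the value `1e-6`, are assumptions made for this example; the helper `compensate_zero_variance` is hypothetical.

```python
import numpy as np

def compensate_zero_variance(data, compensation=1e-6):
    """Add `compensation` to one element of every constant column."""
    out = data.astype(float, copy=True)
    constant = out.max(axis=0) == out.min(axis=0)   # columns with no variance
    out[0, constant] += compensation                # choice of element is an assumption
    return out

data = np.array([[1.0, 5.0],
                 [2.0, 5.0],
                 [3.0, 5.0]])                       # second column is constant

std_before = data.std(axis=0, ddof=1)               # zero for the constant column
fixed = compensate_zero_variance(data)
std_after = fixed.std(axis=0, ddof=1)
scaled = fixed / std_after                          # unit-variance scaling now works
```

Without the compensation, dividing the constant column by its standard deviation would be a division by zero.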
References
Keywords: Principal component analysis, dimensionality reduction, noise reduction, eigen values, eigen vectors
Copyright © 2022, NONPI Medical AB