PCA

Class: NodeVectorPCA

Principal component analysis (PCA) is a method often used to reduce the dimensionality of data by transforming a large set of variables into a smaller one that still contains most of the information in the full dataset. It can make data easier to explore and visualize. By expressing the dataset in terms of the components (combinations of variables) that contribute most to the variation in the data, the number of variables needed to describe the dataset can be reduced. The cost is that some high-frequency information is lost; because that information is often dominated by noise, discarding it can also serve as a noise reduction technique.

[Figure: PCA]

The figure shows PCA of a 2D dataset. After PCA, component 2 contains very little information and practically all of the variance in the dataset is described by component 1.
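
The situation in the figure can be reproduced with a minimal NumPy sketch. The dataset below is synthetic and hypothetical (two strongly correlated variables), and the eigendecomposition of the covariance matrix is used as one common way to compute PCA; it is not necessarily how the node computes it internally.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2D dataset: the second variable is almost a multiple of the first.
x = rng.normal(size=500)
data = np.column_stack([x, 0.5 * x + 0.05 * rng.normal(size=500)])

# Center the data and compute the covariance matrix of the two variables.
centered = data - data.mean(axis=0)
cov = np.cov(centered, rowvar=False)

# eigh returns eigenvalues in ascending order; sort them in descending order
# so that component 1 is the direction of largest variance.
eigenvalues, eigenvectors = np.linalg.eigh(cov)
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Fraction of the total variance described by each component.
explained = eigenvalues / eigenvalues.sum()

# Coordinates of every sample expressed in the component basis.
scores = centered @ eigenvectors

print("explained variance fraction per component:", explained)
```

For data like this, nearly all of the variance ends up in the first entry of `explained`, which is the behaviour the figure illustrates.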

The PCA in MICE is based on Extreme Optimization.

Inputs

Image

The input image to be analyzed.

Type: Image4DVector3, Required, Single

Mask

A mask defining which area should be included in the analysis. Must have the same matrix size as the input image.

Type: Image4DBool, Optional, Single

Outputs

Components

The components found using PCA. The number of components equals the size of the input data along the analyzed dimension, e.g. if you input 3 frames of a time series and perform PCA along the T dimension, the number of components will be 3.

Type: Image4DVector3

Prediction

The prediction of the input data using the Number of Components defined in the Node settings.

Type: Image4DVector3

Data

A table containing the eigenvalues and eigenvectors of the components.

Type: DataCollection

Settings

Scaling Method Selection

When the variables in a PCA use very different scales, the principal components give more weight to the variables with larger values. To put all variables on an equal footing, they are often scaled before the analysis. The ScalingMethod property determines whether and how this transformation is performed. It is of type ScalingMethod and can take the following values:

Value         Description
None          No scaling is performed.
UnitVariance  The columns are scaled to have unit variance. This is the default.
VectorNorm    The columns are scaled to have unit norm.
Pareto        The columns are scaled by the square root of the standard deviation.
Range         The columns are scaled to have unit range (difference between largest and smallest value).
Level         The columns are scaled by the column mean.

Values: None, UnitVariance, VectorNorm, Pareto, Range, Level
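
To make the scaling options concrete, the sketch below implements each one as a per-column divisor in NumPy. The function name `scale_columns` is hypothetical, and details such as the use of the sample standard deviation (`ddof=1`) and the absence of centering are assumptions; the underlying library may differ in these details.

```python
import numpy as np

def scale_columns(data, method="UnitVariance"):
    """Divide each column of a (samples, variables) array by a per-column factor.

    Illustrative only: the exact conventions (ddof, centering) are assumptions.
    """
    data = np.asarray(data, dtype=float)
    if method == "None":
        return data.copy()
    if method == "UnitVariance":
        factor = data.std(axis=0, ddof=1)           # standard deviation
    elif method == "VectorNorm":
        factor = np.linalg.norm(data, axis=0)       # Euclidean norm of the column
    elif method == "Pareto":
        factor = np.sqrt(data.std(axis=0, ddof=1))  # square root of the std
    elif method == "Range":
        factor = data.max(axis=0) - data.min(axis=0)
    elif method == "Level":
        factor = data.mean(axis=0)                  # column mean
    else:
        raise ValueError(f"unknown scaling method: {method}")
    return data / factor

# Two variables on very different scales.
example = np.array([[1.0, 10.0], [2.0, 20.0], [4.0, 60.0]])
for name in ["None", "UnitVariance", "VectorNorm", "Pareto", "Range", "Level"]:
    print(name, scale_columns(example, name).round(3).tolist())
```

Note that every option except None divides by a quantity that is zero for a constant column (or for a zero-mean column in the case of Level), which is why a compensation for zero variance can be needed.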

Number of Components Integer

The number of components used to reconstruct the Prediction output from the input data. If the number of components equals the dimensionality of the input data, the prediction will equal the input. If set to a lower value, the prediction will contain less noise.
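
The effect of this setting can be illustrated with a small NumPy sketch. The helper `pca_predict` is hypothetical and uses a singular value decomposition of the centered data, which is one standard way to obtain the principal directions; the data are synthetic (a single underlying signal plus noise) and no scaling is applied.

```python
import numpy as np

def pca_predict(data, n_components):
    """Reconstruct (samples, variables) data from its first n_components components."""
    mean = data.mean(axis=0)
    centered = data - mean
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]
    # Project onto the retained directions, map back, and restore the mean.
    return (centered @ basis.T) @ basis + mean

rng = np.random.default_rng(1)
# Hypothetical data: one underlying signal seen in three variables, plus noise.
clean = np.outer(rng.normal(size=200), [1.0, 2.0, 3.0])
noisy = clean + 0.1 * rng.normal(size=clean.shape)

full = pca_predict(noisy, 3)     # all components kept
reduced = pca_predict(noisy, 1)  # only the strongest component kept

print("max difference with all components:", np.abs(full - noisy).max())
print("error vs clean, noisy input:       ", np.linalg.norm(noisy - clean))
print("error vs clean, 1 component:       ", np.linalg.norm(reduced - clean))
```

With all three components the reconstruction matches the input up to floating-point rounding, while keeping a single component brings the result closer to the noise-free signal in this example.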

Dimension Selection

The image dimension along which the PCA is performed.

Values: X, Y, Z, T
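
One way to picture the dimension setting is as a choice of which axis of the 4D array supplies the variables, with every position along the remaining axes acting as one sample. The sketch below assumes a scalar-valued array with axis order (X, Y, Z, T); both the axis order and the helper name `to_samples` are assumptions made for illustration, not a description of the node's internals.

```python
import numpy as np

# Assumed axis order of the 4D array; hypothetical, for illustration only.
AXES = {"X": 0, "Y": 1, "Z": 2, "T": 3}

def to_samples(volume, dimension):
    """Reshape a 4D array into (samples, variables) for PCA along one dimension."""
    axis = AXES[dimension]
    # Move the chosen axis last, then flatten all other axes into samples.
    moved = np.moveaxis(volume, axis, -1)
    return moved.reshape(-1, volume.shape[axis])

# A 4 x 5 x 6 volume with 3 time frames.
volume = np.zeros((4, 5, 6, 3))

print(to_samples(volume, "T").shape)  # one sample per voxel, 3 variables
print(to_samples(volume, "X").shape)  # 4 variables instead
```

Under this view the number of components equals the size of the chosen axis, for example 3 for a time series with 3 frames analyzed along T.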

Zero Variance Compensation Number

If scaling is used, this value is added to one element of each column that has no variance; otherwise the scaling would fail.
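
The reason for this setting can be seen in a short NumPy sketch: a constant column has a standard deviation of zero, so dividing by it is undefined. The helper `compensate_zero_variance` is hypothetical, and the choice of the first element as the one that is modified is an assumption; the text only states that one element is changed.

```python
import numpy as np

def compensate_zero_variance(data, compensation=1e-6):
    """Add a small value to one element of every constant column."""
    data = np.array(data, dtype=float)  # work on a copy
    constant = (data.max(axis=0) - data.min(axis=0)) == 0
    # Assumption: the first element of each constant column is modified.
    data[0, constant] += compensation
    return data

# The second column has no variance.
example = np.array([[1.0, 5.0], [2.0, 5.0], [3.0, 5.0]])
fixed = compensate_zero_variance(example, compensation=1e-3)

# Unit-variance scaling is now well defined for every column.
scaled = fixed / fixed.std(axis=0, ddof=1)
print(np.isfinite(scaled).all())
```

Columns that already vary are left untouched, so the compensation only affects variables that carry no information to begin with.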

Keywords: Principal component analysis, dimensionality reduction, noise reduction, eigen values, eigen vectors