TS_Subset (and functions like TS_CompareFeatureSets, described below, allow you to test the sensitivity of these results).
TS_Classify uses all of the features in a given hctsa data matrix to classify assigned class labels.
cfnParams structure. For the labeling defined in a given TimeSeries table, you can set defaults for this structure using cfnParams = GiveMeDefaultClassificationParams('norm') (which takes the labeling from HCTSA_N.mat). This automatically sets an appropriate number of folds for cross-validation, and includes settings that account for class imbalance in classifier training and evaluation. It is best to alter the values inside this function to suit your needs, so that the same settings are applied consistently across analyses.
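As a minimal sketch of this step, assuming the hctsa toolbox is on the MATLAB path and a labeled HCTSA_N.mat sits in the current directory, the defaults can be generated and inspected before anything is changed. Only the function call is taken from the text above; the sketch deliberately assumes no particular field names.

```matlab
% Sketch only: requires hctsa on the MATLAB path and a labeled
% HCTSA_N.mat in the working directory.
cfnParams = GiveMeDefaultClassificationParams('norm');

% Print every default setting, without assuming specific field names
disp(cfnParams)
```

Inspecting the structure first shows which settings exist in your installed version before you edit them inside the function.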
HCTSA_N.mat, using default classification settings:
'svm_linear') tend to generalize well, but you can vary the settings in cfnParams to get a sense for how the performance changes.
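A sketch of running the classification with a linear SVM might look like the following. Two things here are assumptions rather than facts taken from this page: that the classifier name is stored in a field called whatClassifier, and that TS_Classify accepts the data specifier followed by cfnParams. Check `help TS_Classify` and the fields of your own cfnParams before relying on either.

```matlab
% Sketch only: requires hctsa on the MATLAB path and a labeled
% HCTSA_N.mat in the working directory.
cfnParams = GiveMeDefaultClassificationParams('norm');

% ASSUMPTION: the classifier is selected through a field named
% 'whatClassifier'. The guard leaves cfnParams untouched if it is not.
if isfield(cfnParams,'whatClassifier')
    cfnParams.whatClassifier = 'svm_linear';
end

% ASSUMPTION: argument order is (whatData, cfnParams)
TS_Classify('norm',cfnParams);
```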
TS_Classify to iterate over the classification settings defined in cfnParams, except using shuffled class labels. This builds up a null distribution from which you can estimate a p-value to infer the significance of the classification accuracy obtained with the true data labeling.
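To make the idea of a null distribution concrete, here is a small sketch of how a permutation p-value can be estimated from shuffled-label accuracies. The numbers are made-up toy values, not hctsa output, and the "+1" correction is one common convention; it is not necessarily the estimate that TS_Classify itself reports.

```matlab
% Toy values for illustration only (NOT produced by hctsa)
trueAccuracy = 0.90;                            % accuracy with the real labels
nullAccuracies = [0.48 0.55 0.51 0.60 0.47];    % accuracies with shuffled labels

% Count how many null accuracies match or exceed the real one
numAsExtreme = sum(nullAccuracies >= trueAccuracy);

% Permutation p-value with the +1 correction, so it is never exactly zero
pValue = (numAsExtreme + 1)/(numel(nullAccuracies) + 1);

fprintf('Estimated p-value: %.3f\n',pValue);
```

With only a handful of nulls the smallest attainable p-value is coarse, so more shuffles are needed to resolve small p-values.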
TS_Init), or subsequently (e.g., using TS_Subset), but you can test the effect such features are having on your dataset using TS_CompareFeatureSets. Here's an example output:
'notLengthDependent') does not significantly alter the classification accuracy, so these features are not single-handedly driving the classification results. Nevertheless, assuming that differences in recording length are not something we want to bias our classification results, it would be advisable to remove these features for peace of mind.
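A comparison of this kind could be launched roughly as follows. The argument order shown for TS_CompareFeatureSets (data specifier, then cfnParams) is an assumption based on the other calls on this page, so confirm it with `help TS_CompareFeatureSets` in your installed version.

```matlab
% Sketch only: requires hctsa on the MATLAB path and a labeled
% HCTSA_N.mat in the working directory.
cfnParams = GiveMeDefaultClassificationParams('norm');

% ASSUMPTION: argument order is (whatData, cfnParams); the feature sets
% to compare are left at whatever defaults the function defines.
TS_CompareFeatureSets('norm',cfnParams);
```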