
Table 2 The advantages and disadvantages provided by clustering SMS data using either of two different formats

From: Identifying clinical course patterns in SMS data using cluster analysis

Columns: (1) SMS data in original format (all SMS time points used for clustering); (2) SMS data transformed into regression coefficients by spline analysis
Simple and intuitive  
Copes with data that do not show a time trend  
Copes with data from clinical course patterns that are fluctuating  
Copes with clinical course data that are all zero values or the same value at all time points  
Preserves all the original information in the data  
With imputation of missing data, all cases can be included, regardless of clinical course patterns  
Copes better with missing data  
A data reduction technique (reduces the likelihood of overfitting the data)  
Reduces the collinearity (autocorrelation) of the data  
Requires a priori assumptions about which spline characteristics are clinically important. This may improve interpretability but may also introduce bias and require the exclusion of cases that do not meet those assumptions
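
The difference between the two data formats compared in this table can be illustrated with a minimal sketch. It is not the authors' procedure: the table does not state the spline type or knot placement, so the linear spline with a single knot, the function name `linear_spline_coefficients`, the knot position, and the three made-up series are all assumptions chosen for illustration. The original format keeps one value per time point for each case, while the transformed format reduces each case to a few regression coefficients.

```python
import numpy as np


def linear_spline_coefficients(series, knot):
    """Fit y = b0 + b1*t + b2*max(t - knot, 0) to each row by least squares.

    series: array of shape (n_cases, n_timepoints) with no missing values.
    Returns an array of shape (n_cases, 3) holding, for each case, the
    intercept, the initial slope, and the change in slope at the knot.
    (Hypothetical helper; the spline form is an assumption.)
    """
    series = np.asarray(series, dtype=float)
    t = np.arange(series.shape[1], dtype=float)
    # Design matrix for a linear spline with one knot.
    basis = np.column_stack([np.ones_like(t), t, np.maximum(t - knot, 0.0)])
    # Solve all cases at once: one column of the right-hand side per case.
    coefs, *_ = np.linalg.lstsq(basis, series.T, rcond=None)
    return coefs.T


# Made-up example data: 3 cases, 12 time points each.
t = np.arange(12)
steady_improvement = 8.0 - 0.5 * t
early_improvement = 8.0 - 2.0 * np.minimum(t, 4)  # falls to 0 by t = 4, then flat
all_zero = np.zeros(12)

# Original format: every time point is a clustering variable.
raw = np.vstack([steady_improvement, early_improvement, all_zero])

# Transformed format: three coefficients per case (assumed knot at t = 4).
coefs = linear_spline_coefficients(raw, knot=4)

print(raw.shape)    # one column per time point
print(coefs.shape)  # one column per spline coefficient
```

Either matrix could then be passed to a clustering routine. The all-zero case ends up with every coefficient equal to zero, so after transformation it carries no information about shape.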