
Table 2. The advantages and disadvantages provided by clustering SMS data using either of two different formats

From: Identifying clinical course patterns in SMS data using cluster analysis

 

SMS data in original format (all SMS time points used for clustering)

SMS data transformed into regression coefficients by spline analysis

Simple and intuitive

 

Copes with data that do not show a time trend

 

Copes with data from clinical course patterns that are fluctuating

 

Copes with clinical course data that are all zero values or the same value at all time points

 

Preserves all the original information in the data

 

With imputation of missing data, all cases can be included, regardless of clinical course patterns

 

Copes better with missing data

 

A data reduction technique (reduces the likelihood of overfitting the data)

 

Reduces the collinearity (autocorrelation) of the data

 

Requires pre-hoc assumptions about which spline characteristics are clinically important. This may improve interpretability but may also introduce bias and require the exclusion of cases that do not meet those assumptions
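To make the two formats in the table concrete, the following is a minimal sketch rather than the article's actual procedure. It assumes one patient's weekly SMS scores are held in a list, and it uses a linear spline with a single knot fitted by ordinary least squares as the transformation. The function names (`solve`, `spline_features`), the knot position, and the example scores are all hypothetical and chosen only for illustration.

```python
def solve(a, b):
    """Solve the small linear system a x = b by Gaussian elimination
    with partial pivoting (illustrative helper, not a library call)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def spline_features(scores, knot):
    """Fit a linear spline with one knot by least squares.

    Missing time points (None) are skipped instead of imputed.
    Returns [intercept, early slope, change in slope after the knot].
    """
    observed = [(t, y) for t, y in enumerate(scores) if y is not None]
    # Design matrix columns: 1, t, and the hinge term max(0, t - knot).
    rows = [(1.0, float(t), max(0.0, float(t - knot))) for t, _ in observed]
    ys = [float(y) for _, y in observed]
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    return solve(xtx, xty)


# Hypothetical patient: pain score falls for three weeks, then stays flat.
weekly_scores = [8, 6, 4, 2, 2, 2, 2, 2]

# Format 1: original format, every SMS time point is a clustering variable.
raw_features = list(weekly_scores)

# Format 2: the same course reduced to three regression coefficients.
reduced_features = spline_features(weekly_scores, knot=3)
```

In this sketch the original format keeps eight variables per patient, while the transformed format keeps three, which illustrates the data reduction noted in the table. Skipping `None` entries also shows why the coefficient format can tolerate some missing time points, whereas the original format would need imputation. The choice of knot is the kind of pre-hoc assumption the last row warns about, and a patient with too few observed points would make the fit impossible in this sketch.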