Definitions: \(x_i\) is the \(i^{th}\) data point, \(n\) is the total number of data points, \(C_k\) is the \(k^{th}\) cluster, \(N\) is the normal distribution parameterized by \(\mu_k\) and \(\sigma_k\) (the mean and covariance of the \(k^{th}\) cluster), and \(P(C_k | x_i)\) is the posterior probability that \(x_i\) belongs to \(C_k\). Summed over \(i\), these probabilities give the expected count of points in \(C_k\).
Expectation: \[P(C_k | x_i) = {P(x_i | C_k) \times P(C_k) \over P(x_i)}, \qquad P(x_i | C_k) = N(x_i; \mu_k, \sigma_k), \qquad P(x_i) = \sum_j P(x_i | C_j) \times P(C_j) \]
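As an illustration of the expectation step, here is a minimal Python sketch for one-dimensional data. It assumes \(\sigma_k\) is the variance of cluster \(k\) (the 1-D case of the covariance); the names `normal_pdf` and `e_step` are hypothetical helpers chosen for this example.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a 1-D normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def e_step(xs, priors, means, variances):
    """Return resp, where resp[i][k] = P(C_k | x_i).

    Assumption: 1-D data, so each sigma_k is treated as a scalar variance.
    """
    resp = []
    for x in xs:
        # Numerators of Bayes' rule: P(x_i | C_k) * P(C_k) for every cluster k.
        joint = [p * normal_pdf(x, m, v) for p, m, v in zip(priors, means, variances)]
        # P(x_i) is the sum of those numerators over all clusters.
        evidence = sum(joint)
        resp.append([j / evidence for j in joint])
    return resp
```

Each row of the result sums to 1, since the denominator \(P(x_i)\) is the sum of the numerators over all clusters.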
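Maximization: \[ P(C_k) = {\sum_i P(C_k | x_i) \over n}, \qquad \mu_k = {\sum_i P(C_k | x_i) \times x_i \over \sum_i P(C_k | x_i)}, \qquad \sigma_k = {\sum_i P(C_k | x_i) \times (x_i - \mu_k)^2 \over \sum_i P(C_k | x_i)} \] For vector-valued \(x_i\), \((x_i - \mu_k)^2\) stands for the outer product \((x_i - \mu_k)(x_i - \mu_k)^T\).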
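The maximization step can be sketched in Python as responsibility-weighted averages. This is a minimal illustration for one-dimensional data, assuming \(\sigma_k\) is a scalar variance; `m_step` and the `resp` layout (`resp[i][k]` holding \(P(C_k | x_i)\)) are hypothetical names chosen for this example.

```python
def m_step(xs, resp):
    """Re-estimate P(C_k), mu_k and sigma_k from resp[i][k] = P(C_k | x_i).

    Assumption: 1-D data, so each sigma_k is returned as a scalar variance.
    """
    n = len(xs)
    num_clusters = len(resp[0])
    priors, means, variances = [], [], []
    for k in range(num_clusters):
        # Expected number of points in cluster k: sum_i P(C_k | x_i).
        weight = sum(resp[i][k] for i in range(n))
        # Weighted mean: each point counts in proportion to its responsibility.
        mean = sum(resp[i][k] * xs[i] for i in range(n)) / weight
        # Weighted variance around the freshly updated mean.
        var = sum(resp[i][k] * (xs[i] - mean) ** 2 for i in range(n)) / weight
        priors.append(weight / n)
        means.append(mean)
        variances.append(var)
    return priors, means, variances
```

With hard assignments (every responsibility 0 or 1) these updates reduce to the ordinary per-cluster proportion, mean, and variance. In practice a cluster whose total weight approaches zero would need special handling, which this sketch omits.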