Technical note Open Access
Abstract: Updating basic running statistics often requires keeping track of cumulants as the data set grows or evolves over time. Common approaches based on running cumulants are inefficient in processing-intensive or Big data contexts, because they must apply running windows to the original data set in order to avoid arithmetic overflow. This work formulates recursive estimators for the arithmetic mean and the (sample) standard deviation of Normally distributed data. The estimators have minimal storage complexity (only the previous step is retained), are arithmetically robust and hence error-resilient, and are inherently parallelizable at the lowest possible level (arithmetic operators).
Keywords: online statistics, parallel processing, Big data
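
To make the idea of a one-step-back recursive update concrete, the following is a minimal sketch in Python. It uses the widely known Welford-style update as a stand-in and is not necessarily the estimator formulated in this note. The function names (`update`, `sample_std`, `running_stats`) and the state layout (count, running mean, running sum of squared deviations) are assumptions chosen for illustration.

```python
import math


def update(n, mean, m2, x):
    """Fold one new sample x into the running state (n, mean, m2).

    Assumed state layout (illustrative, not taken from the note):
      n    - number of samples seen so far
      mean - running arithmetic mean
      m2   - running sum of squared deviations from the mean
    Only the previous state is needed; no raw samples are stored.
    """
    n += 1
    delta = x - mean
    mean += delta / n
    m2 += delta * (x - mean)  # uses the already-updated mean
    return n, mean, m2


def sample_std(n, m2):
    """Sample standard deviation (divisor n - 1); NaN below two samples."""
    return math.sqrt(m2 / (n - 1)) if n > 1 else float("nan")


def running_stats(data):
    """Stream over data once and return (mean, sample standard deviation)."""
    n, mean, m2 = 0, 0.0, 0.0
    for x in data:
        n, mean, m2 = update(n, mean, m2, x)
    return mean, sample_std(n, m2)


if __name__ == "__main__":
    print(running_stats([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]))
```

Because each step depends only on the previous state and the new sample, the raw sum of samples is never accumulated, which is what keeps this kind of update away from the overflow problem mentioned in the abstract.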