Monday, March 13, 2017

The one-sample t-test (coefficient significance test)

The one-sample t-test checks whether the population mean μ equals a hypothesized value μ0:


H0: μ=μ0
H1: μ≠μ0


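Given a sample of independent measurements xi, i=1..n, with sample mean x̅ and sample standard deviation s, the t statistic is defined as:

$$ t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} $$

Under the null hypothesis, the t statistic follows Student’s t-distribution with n-1 degrees of freedom (exactly when the measurements are normally distributed, approximately for large n otherwise).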
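
As a rough illustration, the statistic can be computed by hand in R and compared with the built-in t.test(). The sample x and the hypothesized mean mu0 are made-up values chosen for this sketch, not data from this post.

```r
# Hypothetical sample and hypothesized mean (illustrative values only)
x <- c(5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.7, 4.8)
mu0 <- 5

n <- length(x)
# t statistic: (sample mean - mu0) / (s / sqrt(n))
t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(n))
# Two-sided p-value from Student's t with n - 1 degrees of freedom
p_value <- 2 * pt(-abs(t_stat), df = n - 1)

# Built-in equivalent for comparison
res <- t.test(x, mu = mu0)
c(manual_t = t_stat, builtin_t = unname(res$statistic))
c(manual_p = p_value, builtin_p = res$p.value)
```

Both pairs of numbers should agree, since t.test() with the default two-sided alternative applies the same formula.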


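To test the significance of an estimated coefficient β̂ in a (regression or ARIMA) model with p predictors, we test whether the true coefficient β equals 0. Here β0=0 takes the role of μ0, and the estimated standard error SE(β̂) takes the role of s/√n:

$$ t = \frac{\hat{\beta} - \beta_0}{SE(\hat{\beta})} = \frac{\hat{\beta}}{SE(\hat{\beta})} $$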

The degrees of freedom decrease by one for each estimated parameter. For instance, for a linear regression model:

$$ df = n - p - 1 $$

because there are p predictors and one intercept.




An example in R:




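# Toy data: response a and three predictors b, c, d
a <- c(1,2,3,4,2,3,6,7,8)
b <- c(1,2,3,4,8,2,4,1,2)
c <- c(1,2,6,4,2,3,0,1,9)
d <- c(1,2,0,4,2,1,0,1,8)
m1 <- lm(a ~ b + c + d)
summary(m1)
# Estimates and standard errors from the coefficient table
coeff <- summary(m1)$coefficients[, "Estimate"]
se <- summary(m1)$coefficients[, "Std. Error"]
# Degrees of freedom: n - p - 1 (observations minus estimated coefficients)
dof <- length(a) - length(coeff)
# Two-sided p-values: twice the tail probability beyond |t|
t_stat <- coeff / se
2 * pt(-abs(t_stat), df = dof)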
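
The hand-computed p-values can be checked against the columns that summary() already reports. A minimal self-contained sketch with the same data; tab, t_manual and p_manual are names introduced only for this check:

```r
a <- c(1,2,3,4,2,3,6,7,8)
b <- c(1,2,3,4,8,2,4,1,2)
c <- c(1,2,6,4,2,3,0,1,9)
d <- c(1,2,0,4,2,1,0,1,8)
m1 <- lm(a ~ b + c + d)
tab <- summary(m1)$coefficients

# Manual t statistics and two-sided p-values
t_manual <- tab[, "Estimate"] / tab[, "Std. Error"]
p_manual <- 2 * pt(-abs(t_manual), df = df.residual(m1))

# Compare with the "t value" and "Pr(>|t|)" columns from summary()
all.equal(t_manual, tab[, "t value"])
all.equal(p_manual, tab[, "Pr(>|t|)"])
# Residual degrees of freedom versus n - p - 1 with p = 3 predictors
df.residual(m1) == length(a) - 3 - 1
```

Using df.residual() avoids counting the parameters by hand, which helps once the model formula changes.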

Friday, March 3, 2017

Time series (theoretical concept)

A time series is a collection of random variables {Y1,Y2,…,YT} ordered in time. There is a stochastic process {Yt} that generates the series, and each element Yt is a random variable with its own probability distribution. In reality, however, we can only observe one particular realization {y1,y2,…,yT} of the stochastic process, in which each value yt is a single draw from the distribution of Yt.
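
To make the difference between the process and a realization concrete, here is a small R sketch that simulates several realizations of one process. The AR(1) process with coefficient 0.6, the length T_len and the number of realizations n_real are assumptions picked purely for illustration.

```r
set.seed(1)
T_len <- 100   # length of each realization (assumed)
n_real <- 5    # number of simulated realizations (assumed)

# Each column is one realization {y1,...,yT} of the same process {Yt}
realizations <- replicate(
  n_real,
  as.numeric(arima.sim(model = list(ar = 0.6), n = T_len))
)
dim(realizations)

# The first few time points differ from column to column
head(realizations, 3)

# To plot all realizations together, one could run:
# matplot(realizations, type = "l", lty = 1, xlab = "t", ylab = "y")
```

In practice only one such column is available, which is why additional assumptions are needed to estimate the parameters of the process.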



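The stochastic process {Yt} is then described by a T-dimensional joint probability distribution. The (unknown) parameters (mean, variance, covariance) of the joint probability distribution of {Yt} are, for each t=1,2,…,T and s=1,2,…,T:

$$ \mu_t = E[Y_t] $$
$$ \sigma_t^2 = \operatorname{Var}(Y_t) = E[(Y_t - \mu_t)^2] $$
$$ \gamma_{t,s} = \operatorname{Cov}(Y_t, Y_s) = E[(Y_t - \mu_t)(Y_s - \mu_s)] $$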

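For a weakly stationary time series it holds that:

$$ E[Y_t] = \mu, \qquad \operatorname{Var}(Y_t) = \sigma^2 < \infty, \qquad \operatorname{Cov}(Y_t, Y_{t-k}) = \gamma_k $$

for all t and every lag k, so the mean and variance are constant over time and the covariance depends only on the lag k.

Because only one realization is observed, we additionally assume the process to be ergodic: the sample moments computed over time approach the population moments as T goes to infinity.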
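
The ergodicity assumption can be illustrated with a rough simulation: for a stationary process, the sample mean of a single long realization should get closer to the population mean as T grows. The AR(1) process, the mean mu = 10 and the coefficient phi = 0.5 are assumptions made for this sketch.

```r
set.seed(42)
mu <- 10      # population mean (assumed)
phi <- 0.5    # AR(1) coefficient (assumed); |phi| < 1 gives a stationary process

# One long realization; arima.sim() produces a zero-mean series, so mu is added
y <- mu + as.numeric(arima.sim(model = list(ar = phi), n = 100000))

# Sample mean over the first T observations, for growing T
T_values <- c(100, 1000, 10000, 100000)
sample_means <- sapply(T_values, function(T_len) mean(y[1:T_len]))

data.frame(T = T_values,
           sample_mean = sample_means,
           abs_error = abs(sample_means - mu))
```

The absolute error tends to shrink as T increases, although not necessarily monotonically for any single simulated path.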