PREDICTION
Time series analysis comprises methods for analyzing time series data in order to extract meaningful statistics and other characteristics of the data. Time series forecasting is the use of a model to predict future values based on previously observed values.
Role / Importance
Time series analysis supports both descriptive and predictive analytics. Many industries, mine included, have very noisy time-based datasets and many dashboards filled with time series data. Being able to separate trend, seasonality and error, and then predict where a series will be in x units of time, is very powerful from a decision-making point of view.
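As a rough illustration of separating trend, seasonality and error, the sketch below uses R's built-in decompose() function on the AirPassengers dataset that ships with R. That dataset is only a stand-in chosen for illustration (an assumption); it is not the jobs data used later in this post.

```r
# AirPassengers is a built-in monthly series, used here only as a stand-in
parts <- decompose(AirPassengers, type = "multiplicative")

# Each component is a time series aligned with the original data
trend     <- parts$trend     # smoothed long-run level (NA at both ends)
seasonal  <- parts$seasonal  # repeating 12-month pattern
remainder <- parts$random    # what is left after removing trend and seasonality

plot(parts)  # four panels: observed, trend, seasonal, random
```

A multiplicative decomposition is used because the seasonal swings in this particular stand-in series grow with its level; for a series with roughly constant swings, type = "additive" may be the better choice.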
Time series analysis is a very important concept in data science.
It is generally carried out in two domains: the frequency domain and the time domain.
Both play a vital role in computationally intensive analysis and in optimization.
One example is time series forecasting, in which the future output of a process is forecast by analyzing its previous data, using methods such as exponential smoothing, moving averages and log-linear regression.
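Two of the methods named above can be sketched with base R alone. The example uses the built-in Nile series as a stand-in, and the window length of 3 is an arbitrary choice for illustration; neither comes from the jobs data in this post.

```r
# Nile is a built-in annual series, used here only as a stand-in

# Trailing 3-point moving average: each value is the mean of the
# current observation and the two before it (first two values are NA)
ma3 <- stats::filter(Nile, rep(1 / 3, 3), sides = 1)

# Simple exponential smoothing: Holt-Winters with the trend (beta)
# and seasonal (gamma) components switched off
ses <- HoltWinters(Nile, beta = FALSE, gamma = FALSE)
ses$alpha                             # estimated smoothing weight
ses_next <- predict(ses, n.ahead = 1) # one-step-ahead forecast
```

The moving average weights the last three points equally, while exponential smoothing gives geometrically decaying weight to older observations, controlled by alpha.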
PROBLEM
Source Code
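# Read the monthly jobs data (path as given in the original post)
mydata <- read.csv("C:/Users/student/Desktop/sk/jobs.csv")
# Reference columns explicitly rather than attach(), which can mask other objects
x <- mydata$Month
Y <- mydata$Total.Filled.Jobs
# First difference of the series
d.y <- diff(Y)
plot(x, Y)
# Autocorrelation and partial autocorrelation of the raw and differenced series
acf(Y)
pacf(Y)
acf(d.y)
# Fit three candidate ARIMA(p, d, q) models and compare their printed log likelihood and AIC
arima(Y, order = c(1, 0, 1))
arima(Y, order = c(0, 0, 1))
arima(Y, order = c(1, 1, 1))
# Keep the MA(1) model and forecast 100 steps ahead
mydata.arima001 <- arima(Y, order = c(0, 0, 1))
mydata.pred1 <- predict(mydata.arima001, n.ahead = 100)
# Widen the x-axis so the forecasts, which start after the last observation, are visible
plot(Y, xlim = c(1, length(Y) + 100))
lines(mydata.pred1$pred, col = "blue")
# predict() returns a list with $pred (forecasts) and $se (standard errors)
head(mydata.pred1)
head(mydata.pred1$pred)
tail(mydata.pred1$pred)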
Output
<- - Assigns a value to a name.
diff - Returns suitably lagged and iterated differences.
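A small sketch of what diff computes, on a made-up four-element vector chosen so the arithmetic is easy to follow (the numbers are illustrative only, not from the jobs data):

```r
y <- c(10, 12, 15, 19)   # toy values for illustration

d1   <- diff(y)                   # 2 3 4  (each value minus the previous one)
d2   <- diff(y, differences = 2)  # 1 1    (the differences of d1)
lag2 <- diff(y, lag = 2)          # 5 7    (each value minus the one two steps back)
```

Each round of differencing shortens the series by one observation per lag, which is why d.y in the script has one element fewer than Y.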
plot - Plots the graph of Y against x.
The function acf computes (and by default plots) estimates of the autocovariance or autocorrelation function.
Up to lag 5 the ACF looks good.
pacf - Computes (and by default plots) estimates of the partial autocorrelation function.
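To read these plots numerically, acf and pacf can be called with plot = FALSE and their values compared with the approximate 95% bounds that R draws as dashed lines. The sketch below runs on a simulated AR(1) series (an assumption for illustration, with an arbitrary coefficient of 0.7), not on the jobs data.

```r
set.seed(1)  # make the simulated series reproducible
sim <- arima.sim(model = list(ar = 0.7), n = 200)  # simulated AR(1) stand-in

a <- acf(sim, lag.max = 10, plot = FALSE)
p <- pacf(sim, lag.max = 10, plot = FALSE)

acf_vals  <- as.numeric(a$acf)   # starts at lag 0, which is always 1
pacf_vals <- as.numeric(p$acf)   # starts at lag 1

# Approximate 95% bound used for the dashed lines in the default plots
bound <- qnorm(0.975) / sqrt(length(sim))

# Lags (1, 2, ...) whose autocorrelation lies outside the bounds
sig_lags <- which(abs(acf_vals[-1]) > bound)
```

For an autoregressive process the ACF typically decays gradually while the PACF cuts off after the AR order, which is the pattern usually used to pick p and q.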
arima - Fit an ARIMA model to a univariate time series.
AR: Autoregression. A model that uses the dependent relationship between an observation and some number of lagged observations.
I: Integrated. The use of differencing of raw observations (e.g. subtracting the observation at the previous time step from the current observation) in order to make the time series stationary.
MA: Moving Average. A model that uses the dependency between an observation and a residual error from a moving average model applied to lagged observations.
p: The number of lag observations included in the model, also called the lag order.
d: The number of times that the raw observations are differenced, also called the degree of differencing.
q: The number of lagged forecast errors included in the model (often described as the size of the moving average window), also called the order of the moving average.
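The mapping from p, d and q to the order argument of arima can be sketched as follows. The example uses lh, a short built-in series that ships with R, purely as a stand-in for the jobs data.

```r
# lh is a built-in series, used here only as a stand-in
p <- 1  # one autoregressive lag
d <- 0  # no differencing
q <- 1  # one lagged error term

fit <- arima(lh, order = c(p, d, q))

est <- coef(fit)   # named vector: ar1, ma1, intercept
fit$loglik         # maximized log likelihood
fit$aic            # Akaike information criterion for this fit
```

With d = 0 the model includes an intercept (the series mean); with d greater than 0, arima fits the differenced series and drops the intercept.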
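Here the AIC and the log likelihood are both higher than for the previous model.
Here the AIC is the lowest of the three fits, so the model with order c(0,0,1) is chosen: a pure moving average model with a single lagged error term (q = 1). R's arima estimates the coefficients by maximum likelihood but does not pick the order itself; the comparison is made by reading the output, where a lower AIC is better and a higher log likelihood indicates a closer fit.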
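The comparison can also be done in code rather than by eye. This is a minimal sketch on the built-in lh series (a stand-in, not the jobs data), and it only compares models with d = 0, since likelihoods computed on differently differenced data are not directly comparable.

```r
# Candidate (p, d, q) orders, all with d = 0 so the AIC values are comparable
orders <- list(c(1, 0, 1), c(0, 0, 1), c(1, 0, 0))

# lh is a built-in series, used here only as a stand-in
fits <- lapply(orders, function(o) arima(lh, order = o))

cmp <- data.frame(
  order  = sapply(orders, paste, collapse = ","),
  loglik = sapply(fits, function(f) as.numeric(logLik(f))),
  aic    = sapply(fits, AIC)
)

# The candidate with the smallest AIC is preferred
best <- cmp$order[which.min(cmp$aic)]
print(cmp)
```

AIC is -2 times the log likelihood plus 2 times the number of estimated parameters, so it rewards fit but penalizes extra coefficients.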
Conclusion
The forecast shows a positive spike, which suggests that the number of jobs filled in the coming month is likely to be higher.
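When judging a conclusion like this, it can help to look at the forecasts together with their standard errors. The sketch below uses the built-in lh series as a stand-in for the jobs data, and the 1.96 multiplier assumes approximately normal forecast errors.

```r
# lh is a built-in series, used here only as a stand-in
fit <- arima(lh, order = c(0, 0, 1))
fc  <- predict(fit, n.ahead = 3)

# Approximate 95% interval around each forecast
lower <- fc$pred - 1.96 * fc$se
upper <- fc$pred + 1.96 * fc$se

# For an MA(1) model only the first step uses the latest error term;
# from the second step onward the forecast equals the estimated mean
step1    <- as.numeric(fc$pred[1])
later    <- as.numeric(fc$pred[2:3])
mean_est <- unname(coef(fit)["intercept"])
```

In other words, a spike from an MA(1) model is a statement about the very next period only; forecasts further ahead flatten out at the series mean.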