Katerina Chryssou* and Eugenia Lampi
General Chemical State Laboratory, B’ Chemical Division of Athens, Department A’ Tsocha 16, Greece
*Corresponding author: Katerina Chryssou, General Chemical State Laboratory, B’ Chemical Division of Athens, Department A’ Tsocha 16, 11521 Athens, Greece
Submission: September 23, 2024; Published: October 10, 2024
Volume 5, Issue 1, October 10, 2024
This work demonstrates the process of using AMOS to test a first-order confirmatory factor analysis (CFA) model. The analyses were performed with the AMOS statistical package. The process of conducting a first-order CFA within the AMOS framework was illustrated with data collected from ten copy paper samples, using a tensile testing machine as the measuring instrument. Our structural equation model was based on the tensile properties of paper in its two main directions, the machine direction (MD) and the cross direction (CD). We tested our model using fit indices: GFI, which was close to 1; AGFI, which had the value of 0.92; RMSEA, which had the value of 0.000 for our default model; CFI, with the value of 1.000; NFI, with the value of 0.997 for the default model; TLI, with the value of 1.119; and IFI, with the value of 1.018 for our default model. The results of the proposed measurement model confirmed the hypothesized latent structures MD and CD, and specified how the observed variables MDSTRESS, MDSTRAIN and CDSTRESS, CDSTRAIN depended on the latent variables. Finally, Amos Graphics showed the distribution of chi-square values based on the bootstrap.
Keywords: AMOS; Confirmatory factor analysis; Tensile strain; Tensile stress; Observed variable; Latent variable; Measurement model
A researcher's approach may change once he or she has a thorough understanding of the constructs underlying the experimental data. That approach, confirmatory factor analysis (CFA), is a theory-testing procedure [1]. The calculation of tensile mechanical properties from stress-strain curves is a fundamental step in characterizing material behavior [2]. The AMOS software package, which performs confirmatory factor analysis [3], may be used to test measurements of tensile strength and strain for paper samples. Confirmatory factor analysis can uncover the underlying constructs of the experimental data. One question that has yet to be settled concerning CFA and structural equation modeling is which fit statistic to use. The present work consults the chi-square statistic, the Bentler (1990) comparative fit index (CFI), the Jöreskog and Sörbom (1986) goodness-of-fit index (GFI), and the root mean square error of approximation (RMSEA) [4].
Instruments and materials
The tensile testing machine Zwick Roell Z2.5 BT1-FR 2.5th D14/2008, S.N. 181435/2008 was used for measuring the tensile strength of ten copy paper samples. The tensile properties of the ten copy paper samples were determined by the method specified in ISO 1924-2 [5]. The tensile machine, with a force capacity of 2.5 kN, extended the paper test pieces at a constant rate of elongation of 20 mm/min and measured the tensile strength and strain. The paper test pieces were cut with an IDEAL 1043 GS guillotine to dimensions of 15 mm x 210 mm. The test pieces were pre-conditioned at 23 °C ± 2 °C and 30% ± 5% r.h. for 24 hours, and then conditioned and tested at 23 °C ± 1 °C and 50% r.h. ± 2% r.h. for 16 hours in a conditioning chamber according to the ISO 187 standard [6]. In this work we performed the analyses with the IBM SPSS Amos version 23 (2015) statistical package.
In this work we demonstrated the process of using AMOS to test a first-order confirmatory factor analysis model. Table 1 contains the SPSS data set we used. The number of cases was 10, and there were four variables. We assumed that our data were normally distributed.
Table 1: MD stress values (MPa) and CD stress values (MPa), with the corresponding MD strain (%) and CD strain (%), for ten copy paper samples.
In Figure 1 we presented a first order CFA model in AMOS. In the model the two first-order factors MD (direction) in paper and CD (direction) in paper explained the observed variables MDSTRESS and MDSTRAIN, and CDSTRESS and CDSTRAIN, respectively.
Figure 1: Two measurement models in AMOS-SPSS. The regression weights needed for parameter estimation were fixed to 1, as shown in the figure.
The two latent variables (latent factors), MD direction and CD direction, were hypothesized to be the main cause of the observed variables. These latent, or unobserved, variables could not be observed directly but were each estimated from two observed variables: MDSTRESS and MDSTRAIN for MD, and CDSTRESS and CDSTRAIN for CD. The measurement model for the MD direction in paper and the measurement model for the CD direction were related through a covariance. Our model was a confirmatory factor analysis comprising two measurement models in SPSS-Amos. The observed variables MDSTRESS, MDSTRAIN, CDSTRESS and CDSTRAIN were measured directly by the measuring instrument, the tensile testing machine [2]. Considering the MD sub-model in Figure 1, the scores of the two observed variables MDSTRESS and MDSTRAIN were hypothesized to depend on the single underlying, but not directly observed, variable MD direction. According to the model, the two scores could still disagree, owing to the influence of error 1 (e1) and error 2 (e2), which represented the errors of measurement in the two observed variables (Figure 1). In this work we followed the steps of model specification, model identification, data collection, parameter estimation, model fit assessment, model comparison, and presentation and interpretation of results.
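The structure described above can be sketched as the covariance matrix implied by a two-factor measurement model, Sigma = Lambda Phi Lambda' + Theta. The following is a minimal illustration only: the loading, variance and covariance values are hypothetical placeholders (not the estimates from this study), and the helper names `matmul`, `transpose` and `implied_covariance` are introduced here for illustration.

```python
# Sketch: covariance matrix implied by a two-factor CFA,
# Sigma = Lambda * Phi * Lambda^T + Theta.
# All numeric values are HYPOTHETICAL placeholders, not the paper's estimates.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(row) for row in zip(*a)]

observed = ["MDSTRESS", "MDSTRAIN", "CDSTRESS", "CDSTRAIN"]

# Loadings: one column per latent variable (MD, CD).
# The first indicator of each factor is fixed to 1 to set the scale.
Lambda = [
    [1.0, 0.0],  # MDSTRESS <- MD (fixed to 1)
    [0.5, 0.0],  # MDSTRAIN <- MD (free, hypothetical value)
    [0.0, 1.0],  # CDSTRESS <- CD (fixed to 1)
    [0.0, 2.0],  # CDSTRAIN <- CD (free, hypothetical value)
]
# Latent variances on the diagonal, MD-CD covariance off the diagonal.
Phi = [[4.0, 1.0],
       [1.0, 3.0]]
# Error variances of e1..e4 (errors assumed mutually uncorrelated).
theta = [1.0, 0.5, 1.0, 0.5]

def implied_covariance(loadings, latent_cov, error_var):
    """Return Lambda * Phi * Lambda^T with error variances added to the diagonal."""
    sigma = matmul(matmul(loadings, latent_cov), transpose(loadings))
    for i, variance in enumerate(error_var):
        sigma[i][i] += variance
    return sigma

Sigma = implied_covariance(Lambda, Phi, theta)
for name, row in zip(observed, Sigma):
    print(name, row)
```

In this sketch the covariance between an MD indicator and a CD indicator is nonzero only because the two latent variables covary, which is the role of the double-headed arrow between MD and CD.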
In Table 2 (assessment of normality), the multivariate kurtosis of 1.330 lay between -2 and +2, a range considered acceptable for treating the data as approximately normally distributed [7,8]. In the same table, the skewness of each of the four variables lay between -0.8 and +0.8 [7,8].
Table 2: Assessment of normality (Group number 1).
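As an illustration of the screening described above, the sketch below computes univariate skewness and excess kurtosis from sample moments and checks them against the ranges quoted in the text. It assumes moment estimators with denominator N, it does not compute the multivariate kurtosis coefficient, and the sample values are hypothetical (the data of Table 1 are not reproduced here).

```python
# Sketch: moment-based skewness and excess kurtosis for one variable.
# Assumption: population-style moments (denominator N).
# The sample below is HYPOTHETICAL, not taken from Table 1.

def skewness_and_kurtosis(values):
    """Return (skewness, excess kurtosis) from central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2 - 3.0  # excess kurtosis: 0 for a normal distribution
    return skew, kurt

def within(value, low, high):
    """True if value lies in the closed interval [low, high]."""
    return low <= value <= high

sample = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical measurements
skew, kurt = skewness_and_kurtosis(sample)
print("skewness acceptable:", within(skew, -0.8, 0.8))
print("kurtosis acceptable:", within(kurt, -2.0, 2.0))
```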
In Table 3 (estimates, regression weights), the p-values for MDSTRAIN and CDSTRAIN were less than the usual significance level of 0.05, which meant that these regression weights were significantly different from zero [9]. Both p-values were close to 0.01. The estimates of 1 for MDSTRESS and CDSTRESS were not estimated from the data: these regression weights were fixed to 1 in order to set the scale of the latent variables MD direction and CD direction (Figure 1). The positive coefficient of 7.627 in Table 3 indicated that as the latent variable MD direction increased, the observed variable MDSTRAIN also tended to increase. The p-value of 0.008 referred to the test of the null hypothesis that the regression weight of MDSTRAIN on MD direction was zero.
Table 3: Estimates
Estimates (Group number 1 - Default model)
Scalar Estimates (Group number 1 - Default model)
Maximum Likelihood Estimates
Regression Weights: (Group number 1 - Default model)
In Table 4 (standardized regression weights) [10], all the estimates were confined to the interval (-1, +1), as expected when each observed variable has a single predictor. These standardized regression coefficients were the weights that would have been obtained had all variables been rescaled to unit variance; the underlying data themselves were not standardized, and the variances of the dependent and independent variables were not equal to 1. The endogenous variables in Table 4 were those with single-headed arrows pointing to them, i.e. those that depended on other variables: MDSTRESS, MDSTRAIN, CDSTRESS and CDSTRAIN.
Table 4: Standardized Regression Weights: (Group number 1 - Default model).
In Table 5 (covariances), a double-headed arrow was used to draw the covariance between the MD direction variable and the CD direction variable. The estimate displayed in Table 5 was that of the covariance between the MD direction and the CD direction, estimated to be 0.001. Next to that estimate, in the S.E. column, was the estimated standard error of the covariance, 0.001. We could use these figures to construct a 95% confidence interval on the population covariance by computing 0.001±1.96*0.001=0.001±0.002. Next to the standard error, in the C.R. column, was the critical ratio obtained by dividing the covariance estimate by its standard error; the displayed values were rounded to three decimals, so the ratio of 1.654 reflected the unrounded estimates. At a significance level of 0.05, the critical ratio of 1.654 did not exceed 1.96 in magnitude and was therefore not significant: we could not say that the covariance between the MD direction and the CD direction was significantly different from 0 at the 0.05 level. The P column [11], to the right of the C.R. column, gave an approximate two-tailed p-value for testing the null hypothesis that the parameter value was 0 in the population. The table showed that the covariance between MD and CD was not significantly different from 0, with p=0.098.
Table 5: Covariances: (Group number 1 - Default model).
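The arithmetic behind the covariance test can be sketched as follows. The snippet takes the reported critical ratio of 1.654 and converts it to a two-tailed p-value under a standard normal reference distribution, and builds the 95% interval from the rounded estimate and standard error. The function names are hypothetical and introduced only for this illustration.

```python
# Sketch: two-tailed p-value from a critical ratio and a 95% Wald interval.
# Inputs are the rounded values quoted in the text (estimate 0.001, S.E. 0.001, C.R. 1.654).
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def two_tailed_p(critical_ratio):
    """Approximate two-tailed p-value for H0: parameter = 0."""
    return 2.0 * (1.0 - normal_cdf(abs(critical_ratio)))

def wald_interval(estimate, std_error, z=1.96):
    """Return (lower, upper) limits of estimate +/- z * std_error."""
    half_width = z * std_error
    return estimate - half_width, estimate + half_width

p_value = two_tailed_p(1.654)
lower, upper = wald_interval(0.001, 0.001)
print(round(p_value, 3))            # compare with the P column
print(lower < 0.0 < upper)          # interval contains 0
```

Because the interval contains 0 and the p-value exceeds 0.05, both routes lead to the same conclusion as the critical-ratio comparison with 1.96.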
In Table 6 (correlations), the estimated correlation between MD direction and CD direction was 1.088. The estimate indicated a strong positive relationship between the two latent variables. Because a correlation cannot exceed 1, an estimate above 1 lies outside the admissible range and, with only ten cases, should be interpreted with caution. A correlation by itself does not show that the variable MD direction depended on the variable CD direction in paper.
Table 6: Correlations: (Group number 1 - Default model).
In Table 7 (variances), Amos produced estimates for our default model. All possible variances were estimated: Table 7 displays the variances of the latent variables MD and CD, as well as of the errors e1, e2, e3 and e4. These six population variances were among the distinct parameters estimated, together with the two free regression weights and the covariance between MD and CD. The sample moments were the variances and covariances of the four observed variables. The covariance between MD and CD was not fixed but estimated, at 0.001 (Table 5), nearly 0.
Table 7: Variances: (Group number 1 - Default model).
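The bookkeeping of moments and parameters can be sketched as a short count. It assumes four observed variables, two regression weights fixed to 1 and two free (MDSTRAIN and CDSTRAIN), six variances and one covariance, as described in the text; the dictionary keys are hypothetical labels used only here.

```python
# Sketch: counting distinct sample moments, free parameters and degrees of freedom.
# The counts follow the model description in the text; key names are illustrative.

def distinct_sample_moments(n_observed):
    """Number of distinct variances and covariances among the observed variables."""
    return n_observed * (n_observed + 1) // 2

n_observed = 4  # MDSTRESS, MDSTRAIN, CDSTRESS, CDSTRAIN
free_parameters = {
    "free_regression_weights": 2,  # MDSTRAIN <- MD, CDSTRAIN <- CD
    "latent_variances": 2,         # MD, CD
    "error_variances": 4,          # e1, e2, e3, e4
    "latent_covariances": 1,       # MD <-> CD
}

moments = distinct_sample_moments(n_observed)
npar_default = sum(free_parameters.values())
df_default = moments - npar_default

# Independence model: only the four observed variances are estimated.
npar_independence = n_observed
df_independence = moments - npar_independence

print(moments, npar_default, df_default, df_independence)
```

With one degree of freedom the default model is over-identified by a single constraint, which is why its chi-square statistic can be tested at all.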
The squared multiple correlations in Table 8 were independent of the units of measurement. Amos displayed a squared multiple correlation for each endogenous variable. The squared multiple correlation of a variable was the proportion of its variance that was accounted for by its predictors. The CD direction accounted for 96.5% of the variance of CDSTRAIN and for 73.9% of the variance of CDSTRESS. Thus both CDSTRAIN and CDSTRESS were well explained by the latent variable CD direction.
Table 8: Squared Multiple Correlations: (Group number 1 - Default model).
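For an observed variable with a single latent predictor, the squared multiple correlation equals the square of its standardized regression weight, so the two quoted values can be converted as in the sketch below. The snippet assumes positive loadings, and the function names are hypothetical.

```python
# Sketch: relating squared multiple correlations (SMC) to standardized loadings.
# Assumption: each indicator has one predictor and a positive loading.
from math import sqrt

def standardized_loading(smc):
    """Magnitude of the standardized regression weight implied by an SMC."""
    return sqrt(smc)

def unexplained_share(smc):
    """Proportion of variance attributed to the error term."""
    return 1.0 - smc

smc_values = {"CDSTRAIN": 0.965, "CDSTRESS": 0.739}  # values quoted in the text
for name, smc in smc_values.items():
    print(name,
          round(standardized_loading(smc), 3),
          round(unexplained_share(smc), 3))
```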
In Table 9 (matrices), a standardized indirect effect represented the part of the relationship between a latent variable (CD, MD) and an observed variable (CDSTRAIN, CDSTRESS, MDSTRAIN, MDSTRESS) that was transmitted through a mediating variable. It was a useful measure because it showed how much of the total effect could be attributed to the mediation process. The total effects in Table 9 were the sum of the direct effect and all indirect effects [12]. For the pairs CD with CDSTRAIN and CDSTRESS, and MD with MDSTRAIN and MDSTRESS, the indirect effect was significant, whereas the direct effect and the total effect were insignificant. For the pairs MD with CDSTRAIN and CDSTRESS, and CD with MDSTRAIN and MDSTRESS, the indirect effect, the direct effect and the total effect were all significant. The standardized total, direct and indirect effects in Table 9 were all below 1.
Table 9: Matrices
Standardized Indirect Effects (Group number 1 - Default model)
The Amos output in Table 10 reported results for three models: the model we designed, also known as the default or proposed model; the saturated model, which fitted the data perfectly; and the independence or null model, which specified that each measured variable was correlated exactly 0.0 with every other measured variable, with no latent constructs. The NPAR column (number of parameters for each model) in the CMIN table showed that the saturated (just-identified) model had 10 parameters, our tested (default) model had 9 parameters, and the independence model (in which all paths were deleted) had 4 parameters (the variances of the 4 variables).
Table 10: Model Fit Summary.
CMIN
In Table 10 (model fit summary), the GFI, or goodness-of-fit index [13], was always less than or equal to 1, with 1 indicating a perfect fit. The required level was GFI>0.90, and the value of 0.992 for the default model indicated that our model fitted the data very well [14]. The AGFI (adjusted goodness-of-fit index) [13] took into account the degrees of freedom available for testing our model. It was bounded above by 1, which indicated a perfect fit. The level of acceptance was AGFI>0.9, and the index value was 0.915 for the default model.
The RMR (root mean square residual) was the square root of the average squared amount by which the sample variances and covariances differed from their estimates obtained under the assumption that our model was correct. The RMR of 0.00 for the default model indicated a perfect fit. According to the RMR, the default and the saturated models were the best among the models considered. The values of AIC, BCC and BIC were used only for comparing models with each other, smaller values being better than larger ones. For several statistics Amos also reported a 90% confidence interval for the population value, with the lower and upper boundaries given in the columns labelled LO90 and HI90.
The CFI was the comparative fit index, truncated to fall in the range from 0 to 1. The CFI value of 1.00 for our default model indicated a very good fit [14]. The TLI (Tucker-Lewis coefficient) typically lay between 0 and 1 but was not limited to that range, and a TLI value close to 1 indicated a very good fit. Since the TLI level of acceptance was >0.90 and our index value was 1.119 for the default model, our model fitted the data well.
A value of the IFI, Bollen’s incremental fit index, close to 1 indicated a very good fit, as did the IFI value of 1.018 for our default model. Likewise, a value of the RFI, Bollen’s relative fit index, close to 1 indicated a very good fit, as did the RFI value of 0.981 for our default model. The NFI level of acceptance was >0.9, and the index value was 0.997 (approximately 1) for our default model [14]. The NFI index was calculated as follows:
$$\mathrm{NFI} = 1 - \frac{\chi^2_{\text{default}}}{\chi^2_{\text{independence}}} = 1 - \frac{0.155}{48.621} \approx 0.997$$
Our default model was much closer to the fit of the saturated model than to that of the independence model. We might say that the default model had a discrepancy that was 99.7% of the way between the terribly fitting independence model and the perfectly fitting saturated model. The independence model could have been obtained by adding constraints to any of the other models, and any model could have been obtained by constraining the saturated model. The default model, with χ²=0.155, lay unambiguously between the perfectly fitting saturated model (χ²=0) and the independence model (χ²=48.621) (Table 10, CMIN).
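The baseline-comparison indices quoted in the text can be reproduced from the two chi-square values with the usual textbook formulas, as in the sketch below. The degrees of freedom are an assumption derived from the NPAR column (10 - 9 = 1 for the default model and 10 - 4 = 6 for the independence model), and the function name is hypothetical.

```python
# Sketch: baseline-comparison fit indices from chi-square values.
# Assumption: df_default = 10 - 9 = 1 and df_independence = 10 - 4 = 6,
# derived from the parameter counts reported in the CMIN table.

def baseline_fit_indices(chi2, df, chi2_base, df_base):
    """Return NFI, RFI, IFI, TLI and CFI for a model against its baseline."""
    nfi = 1.0 - chi2 / chi2_base
    rfi = 1.0 - (chi2 / df) / (chi2_base / df_base)
    ifi = (chi2_base - chi2) / (chi2_base - df)
    tli = (chi2_base / df_base - chi2 / df) / (chi2_base / df_base - 1.0)
    # CFI uses noncentrality estimates truncated at zero.
    numerator = max(chi2 - df, 0.0)
    denominator = max(chi2_base - df_base, chi2 - df, 0.0)
    cfi = 1.0 - numerator / denominator if denominator > 0 else 1.0
    return {"NFI": nfi, "RFI": rfi, "IFI": ifi, "TLI": tli, "CFI": cfi}

indices = baseline_fit_indices(chi2=0.155, df=1, chi2_base=48.621, df_base=6)
for name, value in indices.items():
    print(name, round(value, 3))
```

TLI and IFI exceed 1 here because the model chi-square (0.155) is smaller than its degrees of freedom, which is also why CFI is truncated to 1.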
ECVI, except for a constant scale factor, was the same as AIC. The columns labelled LO90 and HI90 gave the lower limit and the upper limit of a 90% confidence interval on the population ECVI. BIC was the Bayes information criterion. In comparison to the AIC, BCC, and CAIC, the BIC assigned a greater penalty to model complexity and therefore had a greater tendency to pick parsimonious models. The BIC index was reported only for the case of a single group where means and intercepts were not explicit model parameters [15].
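The relationship between AIC and ECVI can be sketched as follows, assuming the common definitions AIC = χ² + 2q (q being the number of estimated parameters) and ECVI = AIC/n with n = N - 1 = 9 for a single group of 10 cases. The resulting numbers are an illustration under those assumptions and are not copied from Table 10.

```python
# Sketch: AIC and ECVI for the three models.
# Assumptions: AIC = chi2 + 2*q, ECVI = AIC / n, n = N - 1 for one group.

def aic(chi2, n_parameters):
    """Akaike information criterion in its chi-square form."""
    return chi2 + 2 * n_parameters

def ecvi(chi2, n_parameters, n_cases):
    """Expected cross-validation index: AIC rescaled by n = N - 1."""
    return aic(chi2, n_parameters) / (n_cases - 1)

N_CASES = 10
models = {
    # name: (chi-square, number of parameters) as quoted in the text
    "default": (0.155, 9),
    "saturated": (0.0, 10),
    "independence": (48.621, 4),
}

results = {name: (aic(c, q), ecvi(c, q, N_CASES)) for name, (c, q) in models.items()}
for name, (a, e) in results.items():
    print(name, round(a, 3), round(e, 3))
```

Since ECVI is AIC divided by a constant, both criteria rank the models identically.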
The RMSEA was the population root mean square error of approximation. The columns labelled LO90 (0.000) and HI90 (0.651) for the default model contained the lower and upper limits of a 90% confidence interval on the population value of the RMSEA. An RMSEA of about 0.05 or less indicates a close fit of the model in relation to the degrees of freedom, and the required level was RMSEA<0.08. With RMSEA=0.000 for our default model, we had an exact fit [13,14].
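The RMSEA point estimate and the upper confidence limit can be sketched numerically. The snippet assumes one degree of freedom (10 sample moments minus 9 parameters) and n = N - 1 = 9, and it uses the fact that a noncentral chi-square variable with 1 degree of freedom is the square of a shifted standard normal variable, so only the standard library is needed. The function names are hypothetical, and the shortcut does not generalize to models with more degrees of freedom.

```python
# Sketch: RMSEA point estimate and upper 90% limit for a model with df = 1.
# Assumptions: df = 1, n = N - 1 = 9; the noncentral chi-square CDF for 1 df
# is obtained from the normal CDF, since X = (Z + sqrt(ncp))**2.
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def ncx2_cdf_df1(x, ncp):
    """CDF of a noncentral chi-square with 1 df and noncentrality ncp."""
    root, delta = sqrt(x), sqrt(ncp)
    return normal_cdf(root - delta) - normal_cdf(-root - delta)

def rmsea_point(chi2, df, n):
    """Point estimate: sqrt(max(chi2 - df, 0) / (df * n))."""
    return sqrt(max(chi2 - df, 0.0) / (df * n))

def rmsea_upper_df1(chi2, n, level=0.90):
    """Upper limit: find ncp whose CDF at chi2 equals (1 - level) / 2."""
    target = (1.0 - level) / 2.0
    low, high = 0.0, 100.0
    for _ in range(200):  # bisection; the CDF decreases as ncp grows
        mid = (low + high) / 2.0
        if ncx2_cdf_df1(chi2, mid) > target:
            low = mid
        else:
            high = mid
    return sqrt(low / n)  # df = 1, so the divisor is n * 1

point = rmsea_point(0.155, 1, 9)
upper = rmsea_upper_df1(0.155, 9)
print(round(point, 3), round(upper, 3))
```

The wide interval reflects the very small sample: with ten cases the data are compatible with both an exact fit and a poor one.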
PCLOSE was a p-value for testing the null hypothesis that the population RMSEA was no greater than 0.05, i.e. H0: RMSEA ≤ 0.05.
By contrast, the p-value in the P column was for testing the hypothesis that the population RMSEA was 0. Its value of 0.694 was non-significant, so the hypothesis of exact fit was not rejected.
Based on their experience with the RMSEA, Browne and Cudeck [16] suggested that an RMSEA of 0.05 or less indicated a close fit. In our model the RMSEA was 0.000. Under that definition of close fit, PCLOSE (0.697) gave a test of close fit, while P gave a test of exact fit.
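The two p-values can be sketched side by side. The test of exact fit refers χ²=0.155 to a central chi-square distribution, and the test of close fit refers it to a noncentral chi-square distribution with noncentrality 0.05² × n × df. The snippet assumes df = 1 and n = N - 1 = 9, and relies on the one-degree-of-freedom identity X = (Z + sqrt(ncp))², so it is not a general implementation; the function names are hypothetical.

```python
# Sketch: p-value of exact fit (P) and of close fit (PCLOSE) for df = 1.
# Assumptions: df = 1, n = N - 1 = 9, close-fit bound RMSEA = 0.05.
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def ncx2_sf_df1(x, ncp):
    """Upper-tail probability of a noncentral chi-square with 1 df."""
    root, delta = sqrt(x), sqrt(ncp)
    return 1.0 - (normal_cdf(root - delta) - normal_cdf(-root - delta))

chi2, df, n = 0.155, 1, 9

# Exact fit: H0 is RMSEA = 0, so the noncentrality parameter is 0.
p_exact = ncx2_sf_df1(chi2, 0.0)

# Close fit: H0 is RMSEA <= 0.05, evaluated at the boundary value.
p_close = ncx2_sf_df1(chi2, 0.05 ** 2 * n * df)

print(round(p_exact, 3), round(p_close, 3))
```

Both values are far above 0.05, so neither the hypothesis of exact fit nor that of close fit is rejected.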
In the model fit summary, Hoelter’s critical N [13] was the largest sample size for which we would have accepted the hypothesis that our model was correct. For the default model Amos reported a Hoelter critical N of 224 at the 0.05 significance level and of 386 at the 0.01 significance level (Table 11).
Table 11: Bootstrap Distributions.
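Hoelter's critical N discussed above can be sketched from the model chi-square. The snippet assumes the form CN = χ²_crit / F + 1, with F = χ² / (N - 1), rounded down to an integer, together with the standard chi-square critical values for one degree of freedom; the function and constant names are hypothetical.

```python
# Sketch: Hoelter's critical N for a model with 1 degree of freedom.
# Assumptions: CN = floor(chi2_crit / F + 1) with F = chi2 / (N - 1);
# critical values are the standard chi-square quantiles for df = 1.
from math import floor

CHI2_CRIT_DF1 = {0.05: 3.841459, 0.01: 6.634897}

def hoelter_critical_n(chi2, n_cases, alpha):
    """Largest sample size at which the model would not be rejected."""
    f_min = chi2 / (n_cases - 1)  # minimum discrepancy per unit of sample size
    return floor(CHI2_CRIT_DF1[alpha] / f_min + 1)

cn_05 = hoelter_critical_n(0.155, 10, 0.05)
cn_01 = hoelter_critical_n(0.155, 10, 0.01)
print(cn_05, cn_01)
```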
The fit indices RMSEA, GFI and CFI achieved the required levels. Therefore, our model was good enough for the analysis. The software package AMOS-SPSS was able to account for paper’s nonlinear stress-strain behavior. Of course, we have not necessarily found the best or the only model that fits our stress-strain measurement data. It is therefore important for researchers to test more than one model when analyzing their data, and to look for models that have fewer estimated parameters and make more empirical sense. Moreover, confirmatory factor analysis tests the a priori expectations of a researcher and encourages more meaningful and empirically based research.
© 2024 Katerina Chryssou. This is an open access article distributed under the terms of the Creative Commons Attribution License , which permits unrestricted use, distribution, and build upon your work non-commercially.