INTRODUCTION
The evaluation of Higher Education Institutions (HEIs) in Brazil has had numerous justifications throughout its history, including preserving quality and improving the educational management process (Amorim; Souza, 1994; Martins, 2009; Silva; Rosa, 2022). The current guidelines were set by Law No. 10.861/2004, which established the National System for Higher Education Evaluation (SINAES), and are coordinated by the National Commission for Higher Education Evaluation (CONAES). The National Institute of Educational Studies and Research Anísio Teixeira (INEP) is responsible for the entire process. The data from these assessments serve as inputs for indicators that determine the directions and actions to improve the quality, commitment, and expansion of higher education in Brazil (Brasil, 2004).
Paragraph 2 of Article 3 of Law No. 10.861 (Brasil, 2004) proposes that the evaluation of institutions should be analyzed by various procedures and instruments, called internal evaluation, or self-assessment, and external evaluation in loco. The Comissões Próprias de Avaliação (CPA – Self-evaluation Commissions) are responsible for self-assessments, which seek to evaluate, investigate, and propose reflections on the pedagogical practice and to fulfill the institution’s social role (INEP, 2017). The instruments aim to bring conclusions about the subjects taught, evaluation of the students, and positioning in relation to the education system itself, portrayed by independently analyzed dimensions.
However, assessments usually require students to engage with extensive instruments, which sometimes becomes a reason for disinterest in participating in the evaluation process (Nunes; Duarte; Pereira, 2017; Palitot; Santos; Brito, 2015; Pinto, 2015; Vieira; Kreutz; Costa, 2019). Another point is that institutional assessments were developed for established conceptions and models of education, whether face-to-face or remote.
These differences occur because, in the face-to-face teaching format, students and teachers are used to the type of structure and organization of the institution and to the roles defined by the formal classroom model (Anderson et al., 2019). In this model, the teacher, with their multifaceted attributions, organizes and designs the educational experience, acting as the provider and motivator of learning and remaining directly accessible within a controlled environment so that knowledge is absorbed satisfactorily. In distance learning (DL) courses, teachers and students are prepared for a more flexible learning process, which requires different behaviors and attitudes (Durli et al., 2018), as well as knowledge and skills in the use of information and communication technology tools (Bertolin; Marchi, 2010), in addition to adequate equipment and internet services to use virtual learning environments (Vieira et al., 2020).
Nevertheless, emergency educational standards were introduced when social distancing health protocols were suddenly imposed to reduce the spread of COVID-19. Pedagogical activities were adopted in a remote format, which changed the educational model so that adaptations were needed in the organization, infrastructure, design, planning, and method of conducting the classes, as well as in how students access the content.
Even in these circumstances, higher education institutions, in addition to the obligation of institutional evaluations defined by law and the relevance of obtaining data on the quality of teaching methods, needed instruments to measure educational capacity. The measurement tools should be able to provide information on the technologies that make up the emergency teaching system and the methods used in the courses taken, especially regarding the students’ perception of learning in the meantime. However, due to the various nuances that each teaching modality has, remote emergency teaching did not cover all the characteristics of distance learning, in addition to requiring skills and elements that went beyond face-to-face meetings.
Given this context, this study innovates by seeking the creation and validation of an evaluation model of the disciplines by the student in emergency education. Additionally, with the establishment of Ordinance No. 2117 of December 6, 2019, which made it more flexible to offer up to 40% of the total workload of various classroom courses in distance education in HEIs, new scenarios for distance education have been created. Therefore, this study will also serve as a basis for future evaluation models of subsequent education systems that emerge as hybrid systems, which are shown as educational trends because they meet the needs and interests of undergraduate courses (Bertolin; Marchi, 2010). Nonetheless, even in a face-to-face system of education, there are special situations in which students remain in a special regime of home exercises, in which, many times, practices and teaching routines similar to those of emergency education are adopted.
THEORETICAL MODEL
Self-assessment aims to expand and qualify higher education to build institutional autonomy. Therefore, when aligned with the institutional development plan, it can be treated as a process of self-knowledge, development, and improvement in academic policies involving all institution actors (CONAES, 2007). Thus, by analyzing the scholarly activities, failures in the quality of teaching and possible implementations or adaptations in the teaching-learning processes may be identified.
This perception is crucial, especially during the emergency period, because there were numerous changes and adaptations in building knowledge (Cunha; Silva; Silva, 2020). Gopal, Singh, and Aggarwal (2021) reported that many educators were concerned with measuring how much knowledge emergency teaching produces and whether it is enough for academic progress.
Thus, the model that will be presented is the object of study for the evaluation of the subjects by the students. Based mainly on the studies listed in Table 1, three dimensions of analysis were defined: teacher, teaching-learning methodology, and discipline.
Table 1 Main authors that supported the dimensions
| AUTHORS | THEME | DIMENSIONS |
|---|---|---|
| Pintrich and DeGroot (1990) | Learner Characterization in Online Learning | Discipline |
| Swan et al. (2008) | Online Learning Experiences | Teaching-Learning; Discipline |
| Schubert-Irastorza and Fabry (2011) | Satisfaction in remote teaching | Discipline |
| Woodworth et al. (2015) | Teaching experience in the online environment | Teaching-Learning; Discipline |
| Eom and Ashill (2016) | Learning Outcomes and Satisfaction in Online Learning | Teacher |
| Anderson et al. (2019) | Online Teaching Presence | Teaching-Learning; Discipline |
| Baber (2020) | Teaching during COVID-19 | Teacher |
| Loton et al. (2020) | Remote Learning during COVID-19 | Discipline |
| Patwardhan et al. (2020) | Emergency Remote Learning | Teacher; Teaching-Learning; Discipline |
| Thurber and Trautvetter (2020) | Hybrid Learning Experiences | Discipline |
| Wei and Chou (2020) | Online Learning Performance and Satisfaction | Teacher |
| Gopal, Singh, and Aggarwal (2021) | Impact of online courses during the pandemic on satisfaction and performance | Teacher |
| Mei Yuan (2021) | Attitude and satisfaction in emergency remote teaching | Teacher |
Source: Prepared by the authors (2023).
Some authors reported the ability of teachers as the biggest challenge of emergency teaching since most of them were used to face-to-face teaching methods and were forced to ensure meaningful learning in an online class format (Baber, 2020; Peloso et al., 2020; Rodriguez et al., 2020), which many had not effectively experienced (Sher, 2009; Schubert-Irastorza; Fabry, 2011). However, some teachers had already employed online teaching methods as auxiliary activities. In contrast, others had the knowledge but did not put it into practice because, for the disciplines for which these teachers were responsible, such methods would not work or were not admitted; some teachers were even unaware of online tools (Patwardhan et al., 2020).
Nonetheless, with the sudden institution of distance protocols, these technologies have become partners with the class record books of educational professionals (Mei Yuan, 2021; Silva, 2007). Therefore, some details that until then were imperceptible became necessary due to the physical absence, as Eom and Ashill (2016) suggested when analyzing the issue of motivation and satisfaction of students through external stimuli, such as simple feedback from teachers. The literature also shows that many changes in perception can be assessed from the learner’s perspective when analyzing faculty involvement, presence, absence, and availability (Gopal; Singh; Aggarwal, 2021).
For Gopal, Singh, and Aggarwal (2021), the factor most influential in student satisfaction and most closely related to academic development was instructor quality, understood as adequacy in attendance, support, and student management. Bertolin and Marchi (2010) also demonstrated that this responsibility should be shared with tutors in distance learning systems since the activities have evaluative specificities that differ from the face-to-face teaching model. However, in emergency education, many teachers did not have access to the help of tutors in the disciplines (Rodriguez et al., 2020), which may explain a false perception of low commitment and little teaching involvement during the COVID-19 pandemic.
Thus, with the Teacher dimension, the goal is to identify the student’s satisfaction regarding the involvement and the adaptation of the teacher to emergency teaching. Some authors, when evaluating the teacher, also suggest that the instrument should measure the management of mediation through online means, interactivity, and interactions (Baber, 2020), as well as instructional presence (Anderson et al., 2019), in addition to analyzing how satisfactorily technological tools were used for synchronous classes (Patwardhan et al., 2020; Swan et al., 2008).
The context of student satisfaction cannot be summarized only to the teacher’s performance since many items are perceived that can influence the teaching of students in a non-face-to-face system. Studies show an influence in the form of readjustment of content, with the adaptation of class time, materials, and tools used (Hoq, 2020; Mei Yuan, 2021; Rodriguez et al., 2020). Alqurashi (2019) reported that satisfaction in an online environment is also linked to the learning experience, the way knowledge was built over time, and the student’s interaction with the discipline.
All these predictors tend to evaluate the Teaching-Learning dimension, which is also influenced by the concrete establishment of the objectives of the discipline, the organization of the virtual environment, and the quality of the materials provided (Eom; Ashill, 2016). Moreover, the difficulty of evaluative activities and the low quality of synchronous classes may complicate the learning process (Wei; Chou, 2020). Likewise, the videos made available in asynchronous activities may undermine aspects of learning, including the establishment of a learning routine and the student’s involvement with the proposed activities (Mei Yuan, 2021).
During the learning process, Anderson et al. (2019) reported that some precepts become necessary for compensatory situations to occur during the migration from verbal to non-verbal, text-based learning. In addition, during the adjustments, the teacher’s role in class as a facilitator who encourages self-directed learning can be interpreted by the students as a sense of absence (Anderson et al., 2019). This lack of presence and social interaction in asynchronous classes, adopted during the COVID-19 pandemic, is a highly relevant issue in Baber’s study (2020). The author demonstrated that there are strong barriers during the learning process, when exchanging experiences, and even in developing oral communication skills.
Moreover, from a theoretical point of view, activities that do not have an appropriate degree of difficulty can discourage learning and even discourage students from continuing with the course (Swan et al., 2008). Uninteresting materials can strongly discourage students from completing class assignments (Hoq, 2020; Pintrich; DeGroot, 1990). When added to little concern for instructional design (Woodworth et al., 2015), a perceived lack of course organization as a whole (Patwardhan et al., 2020), and the failure to link the subjects clearly with the context of professional practice (Schubert-Irastorza; Fabry, 2011), these factors can significantly and negatively influence student evaluation.
Besides the intrinsic details of learning, other concepts can influence student evaluation. Perhaps they are not entirely latent in the initial years of the courses. However, when one has already undergone most of the undergraduate course, or even during graduate studies, there must be recognition of the relationship between the subjects studied and future professional life or even the visualization of the practical application to what is proposed in the classroom activities (Swan et al., 2008). Because of this, it is sought to measure the perception of students through items that bring information about the proposal of subjects, portraying how the respondents see the suitability of the class content and their importance to the DL format (Thurber; Trautvetter, 2020), and how much they motivate students (Loton et al., 2020), contribute, and are helpful to train these students (Sher, 2009).
However, Bertolin and Marchi (2010) describe the disciplines that may be suitable for semi-remote environments as incentives for reading and potential professional qualification because they are motivators for using tools and methods that encourage, provoke, and instigate research.
The items that formed these constructs sought to analyze and evaluate the class content studied, the pedagogical practices, and the teachers and instructors, in addition to their activities and actions. Thus, they reflected and represented the students’ views and evaluation of the subjects offered by the institution during the emergency remote teaching.
The student’s vision or perception during the assessment is related to their satisfaction concerning several determinants, which, according to Venkateswarlu, Malaviya, and Vinay (2020) and Yunusa and Umar (2021), go beyond a state of mind. These determinants have a strong relationship with the quality of the faculty members, the opportunities provided during participation in the course subjects, the relationship with the professors, the identification, clarity, and organization of the class content, as well as the relationship of this learning to employability after completing the course.
The study was based on Churchill (1979), Pasquali (2010), and Boateng et al. (2018), who defined and established the best practices in scale development. All stages of the study are listed in Table 2.
Table 2 Summary of the scale development techniques
| Stage | Step | Description |
|---|---|---|
| Stage 1: Item Development | Step 1: Identification of the Domain(s) and Item Generation | Identification of the items and dimensions in the literature. |
| | Step 2: Content Validity | Evaluation meetings with the CPA and experts’ validation and adaptation of the content. |
| Stage 2: Scale-up | Step 3: Pre-test questions | Semantic validity of the items performed by 10 students. |
| | Step 4: Survey administration and sample size | Definition of the sample and the instrument’s application process by the UFSM Questionnaire System. |
| | Step 5: Item Reduction Analysis; Step 6: Factor Extraction | Exploratory Factor Analysis: dimensionality and dimension reliability checks and empirical item analysis. |
| Stage 3: Scale evaluation | Step 7: Dimensionality Tests; Step 8: Reliability Tests; Step 9: Validity Tests | Confirmatory Factor Analysis, searching for convergent and discriminant validity of the factors; Structural Equation Modeling. |
| Final Stage | | Development of the methodology for applying the evaluation scale. |
Source: Prepared by the authors (2023).
In the first stage, the study was characterized as exploratory since it sought documents and hypotheses that would subsidize the development of a model that would evaluate teaching at an odd moment in the history of education (Hair et al., 2014). It was a qualitative step that involved creating a preliminary instrument defining the dimensions and items of the questionnaire. These definitions are based on the literature review, searching for bibliographies that address the dimensions and justify the creation of the theoretical model and its constructs, as presented in the previous section. At the end of this stage, the preliminary instrument, the product resulting from the process, was evaluated by specialists seeking to refine and validate its content.
To this end, a minimum of five experts was set, and their judgments were used to compute the content validity coefficient (CVC) and the Kappa coefficient. The answers yielded an overall average CVC of 0.9297 for relevance and 0.8830 for pertinence; the former exceeds and the latter falls slightly below the 0.9 threshold expected in the literature (Polit; Beck, 2006).
The coefficients resulting from Fleiss’ Kappa calculation, however, returned indices between 0.73 and 0.74, representing a moderate degree of total agreement (Kappa >0.61), according to the numerical interpretation of the result reported by Matos and Rodrigues (2019). For the instrument’s pre-test, the questionnaires were applied to ten different profiles of the target population. The participants showed no difficulties, so the initial questionnaire was kept.
From this stage on, the approach and purpose of the study were changed to quantitative and descriptive. The data was collected through survey research to achieve this stage’s objective. Thus, we used a questionnaire with 18 closed questions, with responses on a six-point Likert scale: 1 being “I strongly disagree” and 6 being “I strongly agree.”
The surveyed population comprised 25,582 students enrolled in at least one of the 4,920 disciplines of the 239 higher education courses. The sample size was calculated as suggested by Mattar, Oliveira, and Motta (2021), which yielded a minimum sample of 878 evaluated courses for a confidence level of 95% and a sampling error of 3%.
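The reported minimum sample can be reproduced with a short sketch. It assumes the usual finite-population formula for estimating a proportion with maximum variance (p = 0.5) and z = 1.96; the text does not state which formula was applied, so the function name and its defaults are illustrative only. Under these assumptions, the value of 878 is obtained when the population is taken as the 4,920 disciplines.

```python
import math


def minimum_sample_size(population, z=1.96, margin=0.03, p=0.5):
    """Finite-population sample size for a proportion (assumed formula).

    n = N * z^2 * p * (1 - p) / (e^2 * (N - 1) + z^2 * p * (1 - p))
    """
    variance = p * (1.0 - p)  # maximum variance when p = 0.5 (assumption)
    numerator = population * z ** 2 * variance
    denominator = margin ** 2 * (population - 1) + z ** 2 * variance
    return math.ceil(numerator / denominator)  # round up to a whole unit


# Population taken as the 4,920 disciplines, 95% confidence, 3% error.
print(minimum_sample_size(4920))  # 878
```

Rounding up is a conservative choice: the unrounded result is about 877.06, and any smaller integer would exceed the 3% error under the same assumptions.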
SCALE EVALUATION
The data were organized using the SPSS 20.0® software. The AMOS 23.0.0 software was used for model estimation in structural equation modeling. This stage was divided into two phases: the first evaluated the measurement model of each construct individually, and the second evaluated the integrated model.
During the exploratory factor analysis (EFA), which was estimated by principal components, data factorability was analyzed using Bartlett’s test of sphericity and the Kaiser-Meyer-Olkin (KMO) test. Based on the extracted communality, the items with values above 0.5 were maintained (Hair et al., 2014). For the analysis of factor reliability, Cronbach’s alpha was measured, and factors with internal consistency above 0.7 were accepted (Malhotra, 2012). With the factors examined, we started the confirmatory factor analysis (CFA). Next, to verify reliability, the composite reliability and the average variance extracted (AVE) were analyzed, for which values above 0.7 and 0.5, respectively, are expected (Hair et al., 2014).
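As a minimal sketch of these two reliability measures, the functions below use the standard formulas based on standardized loadings: composite reliability as the squared sum of loadings over itself plus the summed error variances, and AVE as the mean squared loading. The loadings in the example are hypothetical and are not taken from the study.

```python
def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    total = sum(loadings)
    # Error variance of each standardized indicator is 1 - loading^2.
    error = sum(1.0 - loading ** 2 for loading in loadings)
    return total ** 2 / (total ** 2 + error)


def average_variance_extracted(loadings):
    """AVE = mean of the squared standardized loadings."""
    return sum(loading ** 2 for loading in loadings) / len(loadings)


# Hypothetical standardized loadings, for illustration only.
example_loadings = [0.8, 0.7, 0.9]
cr = composite_reliability(example_loadings)
ave = average_variance_extracted(example_loadings)
print(round(cr, 3), round(ave, 3))
print(cr > 0.7 and ave > 0.5)  # thresholds cited from Hair et al. (2014)
```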
To calculate the AVE and composite reliability indices, the construct models must first be satisfactorily adjusted. For this, the χ² (chi-square), root mean square residual (RMSR), root mean square error of approximation (RMSEA), and goodness-of-fit index (GFI) were calculated and analyzed. For comparative fit measures, which compare the proposed model to the null model, the following indices were evaluated: the comparative fit index (CFI), the normed fit index (NFI), and the Tucker-Lewis index (TLI).
There is no consensus in the literature on the ratio of chi-square to degrees of freedom (χ²/df), but a value below 3 is recommended (Hooper et al., 2008). For the RMSR, Kline (2011) reported that acceptable values are below 0.10. The fit of RMSEA to the sample is accepted by Hair et al. (2014) when the values do not exceed 0.08. The last absolute measure of fit analyzed in this study was the GFI, whose expected value, as Kline (2011) suggested, should be near 1.
Comparative adjustment measures were used to search for convergent validity. For this, we used the values of CFI, NFI, and TLI, which range from 0 to 1 and are recommended to obtain a level above 0.9 (Hair et al., 2014; Kline, 2011; Pedhazur; Schmelkin, 1991).
The discriminant validity was verified by evaluating how much the constructs differ from each other using the technique developed by Fornell and Larcker (1981). This technique proposes that, for each pair of factors analyzed, the square root of the average variance extracted of each factor should be higher than the correlation between the two constructs. In addition, the Kline (2011) criterion determines that the correlation between the constructs should not exceed 0.85.
To check if the construct is unidimensional, the standardized residuals of the construct indicators are considered; according to Garver and Mentzer (1999), these should be below 2.58 to be considered unidimensional. With the constructs tested, the final model was adjusted. To this end, all the adjustment indices of the constructs (i.e., CFI, GFI, NFI, TLI, RMSEA, and RMSR) were used, which were suggested to validate the individual models. They were also verified in the evaluation of the integrated theoretical model.
Lastly, to present the method for applying the student assessment, we weighted the items by the standardized coefficients of the questions that formed each construct, as suggested by Hair et al. (2014).
RESULTS AND DISCUSSIONS
The application of the assessment instruments took place digitally using the Federal University of Santa Maria questionnaire platform from October to December 2021. The students were invited to participate through the university’s social networks and e-mail messages, indicating that the courses’ evaluations had already been released. As for the scales, the questions varied from strongly disagree, symbolized by the numeral 1, to strongly agree, with a numeric indication of 6.
During the tests, the Teaching-Learning construct was unsuccessful in discriminant validation, so the proposed integrated model, defined a priori, was rejected due to possible cross-loadings in the Teaching-Learning factor. As a solution, we started testing a new model uniting the constructs of Teacher and Teaching-Learning. This suggestion came from studies showing that some items from both dimensions were incorporated into the same factor (Mei Yuan, 2021; Patwardhan et al., 2020; Swan et al., 2008). Therefore, the validation of the new proposed model was composed of the constructs of Teacher and Discipline, which are presented below.
FACTOR VALIDATION
The factor analyses were estimated by the principal components method. Factor analysis was possible for both the Teacher and the Discipline dimensions since the Kaiser-Meyer-Olkin tests demonstrated good sample adequacy, with KMO of 0.972 and 0.858, respectively. Bartlett’s tests of sphericity for the two dimensions were also significant, with χ² = 73598.869 (Sig < 0.001) and χ² = 16475.633 (Sig < 0.001), respectively.
The variance explained for the dimension Teacher was 78.33%, and the item “Q7 The teachers acted in an integrated way during the development of the subject” was removed because it presented low communality (0.099). In the reliability test, Cronbach’s alpha coefficient of 0.966 showed adequate internal consistency.
In the dimension Discipline, the items explain 82.87% of the variance after removing the item “Q17 I believe that this discipline is adequate to be offered in the DL model in a post-pandemic context (after returning to face-to-face classes),” which showed insufficient communality (0.182). The items showed acceptable factor loadings and an internal consistency (Cronbach’s alpha) of 0.929, as summarized in Table 3.
Table 3 The dimensions of Teacher and Discipline with the respective items and factor loadings, explained variance, and Cronbach’s alpha
| Item | Statement | Factor loading | Variance explained (%) | Cronbach’s alpha |
|---|---|---|---|---|
| TEACHER | ||||
| Q1 | The teacher used means of interactivity that contributed to the teaching-learning process. | 0.854 | 78.332 | 0.969 |
| Q2 | The teacher was committed to the students’ learning. | 0.877 | ||
| Q3 | The teacher was available to clarify doubts and questions about the subject. | 0.793 | ||
| Q4 | The teacher was actively involved in the development of online teaching. | 0.856 | ||
| Q5 | The teacher provided feedback on the evaluative activities. | 0.676 | ||
| Q6 | The teacher demonstrated mastery of the information and communication technologies used. | 0.773 | ||
| Q8 | The materials provided stimulated my learning. | 0.820 | ||
| Q9 | The virtual learning environment was well organized. | 0.809 | ||
| Q10 | The objectives of the course were clearly communicated. | 0.822 | ||
| Q11 | The evaluative activities were of an appropriate level of difficulty. | 0.697 | ||
| Q12 | Synchronous (real-time, “live”) classes have contributed to the learning process. | 0.736 | ||
| Q13 | The asynchronous (recorded) video lessons contributed to the learning process. | 0.688 | ||
| DISCIPLINE | ||||
| Q14 | I can establish relations between the contents of this discipline and other subjects, practices, and experiences of my course. | 0.905 | 82.87 | 0.929 |
| Q15 | I understand the relevance of this discipline to my education. | 0.932 | ||
| Q16 | I believe that the subject is appropriately placed in the advised sequence of the course. | 0.874 | ||
| Q18 | I believe that this discipline encourages my professional training. | 0.929 |
Source: Prepared by the authors (2023).
After the exploratory stage, we started the CFA. In the search for convergent validity and aiming to adapt the constructs to the pre-established limits of each index, correlations between the errors of the observed variables were introduced based on suggestions from the AMOS software and accepted if they made theoretical sense; notably, the changes were performed one at a time. The adjustment indices are presented in Table 4. The IM column shows the indices of the Initial Model (without any change or correlation proposed between the items), and the FM column shows the indices of the Final Model after the additions suggested by the software. However, when analyzing the dimension Discipline, unlike the Teacher construct, the initial adjustment indices assumed acceptable values with no need for interferences.
Table 4 Fit indices of the Dimensions
| Index | Limit | Teacher (IM) | Teacher (FM) | Discipline (IM) |
|---|---|---|---|---|
| χ² (value) | --- | 3719.612 | 477.764 | 6.307 |
| χ² (probability) | > 0.05 | 0.000 | 0.000 | 0.043 |
| χ²/df (degrees of freedom) | < 5 | 68.882 | 17.063 | 3.153 |
| Goodness of Fit | > 0.95 | 0.865 | 0.985 | 0.999 |
| Comparative Fit Index | > 0.95 | 0.950 | 0.994 | 1.000 |
| Normed Fit Index | > 0.95 | 0.950 | 0.994 | 1.000 |
| Tucker-Lewis Index | > 0.95 | 0.939 | 0.986 | 0.999 |
| Root Mean Square Residual | < 0.08 | 0.027 | 0.009 | 0.002 |
| Root Mean Square Error of Approximation | < 0.06 | 0.117 | 0.057 | 0.021 |
| Composite Reliability | > 0.7 | | 0.971 | 0.932 |
| Average Variance Extracted | > 0.5 | | 0.740 | 0.773 |
Source: Prepared by the authors (2023).
Both dimensions obtained values below 2.58 in their standardized residuals. Such values are considered by Garver and Mentzer (1999) as acceptable to represent the unidimensionality of the constructs. Thus, the constructs were validated for their convergence and unidimensionality. The Fornell and Larcker (1981) criterion was used for discriminant validity. We verified that the correlation found between the constructs (0.780) was lower than the square root of the AVE of each construct (0.860 for Teacher and 0.879 for Discipline). Next, Kline’s (2011) rule was applied, which states that the correlation between the constructs (0.780) should not exceed 0.85. Therefore, both criteria confirm that the constructs have discriminant validity.
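The two discriminant validity checks can be sketched in a few lines using the values reported above (AVE of 0.740 and 0.773, correlation of 0.780). The function name and its return format are illustrative assumptions, not part of the original analysis, which was carried out in AMOS.

```python
import math


def discriminant_validity(ave_a, ave_b, correlation, kline_limit=0.85):
    """Return (Fornell-Larcker satisfied, Kline criterion satisfied)."""
    # Fornell and Larcker (1981): the correlation must be lower than the
    # square root of the AVE of each construct in the pair.
    fornell_larcker = abs(correlation) < min(math.sqrt(ave_a), math.sqrt(ave_b))
    # Kline (2011): the correlation should not exceed the limit (0.85).
    kline = abs(correlation) <= kline_limit
    return fornell_larcker, kline


# AVE of Teacher and Discipline and their correlation, as reported in the text.
print(round(math.sqrt(0.740), 3), round(math.sqrt(0.773), 3))  # 0.86 0.879
print(discriminant_validity(0.740, 0.773, 0.780))  # (True, True)
```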
VALIDATION OF THE INTEGRATED THEORETICAL MODEL
The structured model’s CFA was performed using the maximum likelihood estimation method. First, the magnitude and statistical significance of the standardized coefficients were analyzed, and the fit indices of the IM of Table 5 were checked. It can be observed that almost all fit indices were adequate.
Table 5 Fit indices of the integrated theoretical model
| Index | Limit | Theoretical model (IM) | Theoretical model (FM) |
|---|---|---|---|
| χ² (value) | --- | 2097.973 | 1252.253 |
| χ² (probability) | > 0.05 | 0.000 | 0.000 |
| χ²/df (degrees of freedom) | < 5 | 27.246 | 17.154 |
| Goodness of Fit | > 0.95 | 0.952 | 0.969 |
| Comparative Fit Index | > 0.95 | 0.979 | 0.988 |
| Normed Fit Index | > 0.95 | 0.978 | 0.987 |
| Tucker-Lewis Index | > 0.95 | 0.967 | 0.980 |
| Root Mean Square Residual | < 0.08 | 0.034 | 0.019 |
| Root Mean Square Error of Approximation | < 0.06 | 0.073 | 0.057 |
Source: Prepared by the authors (2023).
However, the χ²/df index (27.246) was well above its limit, and the RMSEA (0.073) was slightly outside the expected value. Hence, to bring these indices within the predefined limits, we followed the suggestions of the AMOS software. Correlations that had theoretical coherence were inserted between the errors, although these associations were low and had little impact on the adjustment indices.
After the correlations were inserted, the adjustment indices became adequate, as shown in the FM column of Table 5, except for the ratio of chi-square to degrees of freedom (χ²/df = 17.154), which remained outside the expected limit. However, this discrepancy can be explained by the sensitivity of this index to sample size, since Hair et al. (2014), Byrne (2010), and Kline (2011) reported that inflated chi-square ratios can occur when samples are large.
Therefore, the proposed scale was obtained (Figure 1), showing the final model of the dimensions integration, the standardized coefficients of the items that compose each construct, and the regression coefficients of the dimensions as formers of the students’ assessment. All the proposed relations presented significance at the 1% level, thus demonstrating that the constructs Teacher and Discipline can measure the student evaluation of the disciplines.

Note: *p < 0.01; ¹ Z value not calculated; the parameter was set to 1 due to model requirements. To simplify the picture, the values of the correlations between the errors were omitted. Source: Prepared by the authors (2023).
Figure 1 Proposed integrated model for student evaluation
The construct Teacher (coefficient 0.942) has the strongest influence on the outcome of the student evaluation. Among the items of this construct, item Q2 (coefficient 0.937), which deals with teachers’ commitment to teaching during the pandemic, showed the highest weight, followed by Q1 (coefficient 0.931), which evaluates teachers’ dexterity with interactive media. Thus, our findings corroborate Swan et al. (2008), who described the strong influence and association of teaching presence with teaching actions.
When integrated into the model, the items that stood out in the Discipline construct were Q14 (coefficient 0.907) and Q18 (0.866). The first relates discipline content to course practice, and the other to students’ professional lives. Both are treated in Gopal, Singh, and Aggarwal (2021) as variables affecting student satisfaction and performance in online courses.
METHOD FOR USING THE STUDENT EVALUATION INDICATOR
This section first presents how the standardized measures of each construct, which together form the student evaluation, were developed. Each construct is scored as a weighted average of the students’ perceptions, with the weights derived from the factor loadings of the items. The same was done for the dimension Evaluation; however, since it is a second-order construct, its weights come from the coefficients of the latent variables. Table 6 presents the respective formulations.
Table 6 Construction of the standardized measures of each construct/dimension
| Teacher = | (Q1 x 0.0899) + (Q2 x 0.0904) + (Q3 x 0.0840) + (Q4 x 0.0890) + (Q5 x 0.0767) + (Q6 x 0.0838) + (Q8 x 0.0848) + (Q9 x 0.0835) + (Q10 x 0.0851) + (Q11 x 0.0767) + (Q12 x 0.0798) + (Q13 x 0.0763) |
| Discipline = | (Q14 x 0.2705) + (Q15 x 0.2383) + (Q16 x 0.2329) + (Q18 x 0.2583) |
| Evaluation = | (Teacher x 0.5307) + (Discipline x 0.4693) |
Source: Prepared by the authors (2023).
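Table 6 lists only the final weights. As a reading aid, the formulations can be written in general form under one assumption that the text does not state explicitly: each weight is the standardized loading of the item divided by the sum of the loadings in its construct. The reported values are consistent with this reading (for instance, 0.937/0.0904 and 0.931/0.0899 are both approximately 10.36 for Q2 and Q1). The symbols λ, w, and C below are notation introduced here for illustration.

```latex
% C: set of items in a construct; \lambda_i: standardized loading of item i
% (assumed normalization, not confirmed by the source)
w_i = \frac{\lambda_i}{\sum_{j \in C} \lambda_j},
\qquad
\text{Score}_C = \sum_{i \in C} w_i \, Q_i,
\qquad
\sum_{i \in C} w_i = 1

% Second-order construct, with the weights reported in Table 6
\text{Evaluation} = 0.5307 \cdot \text{Teacher} + 0.4693 \cdot \text{Discipline}
```

Because the weights in each row of Table 6 sum to 1, each score stays on the same scale as the original item responses.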
Applying these formulations to each student’s responses for each subject gives that student’s assessment of the subject. Averaging the students’ results for a discipline gives a comprehensive evaluation of that discipline. Likewise, the students’ overall perception of a course’s subjects is obtained by averaging the perceptions for all of its subjects. Lastly, on a more macro level, the average of the perceptions of all subjects gives the student evaluation of the subjects offered by the institution.
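The procedure can be illustrated with a minimal Python sketch. The weights are copied from Table 6; everything else is an assumption made for illustration: the function names, the example responses, the 1-to-5 response scale, and the choice to aggregate by averaging subject-level averages (rather than pooling all students).

```python
from statistics import mean

# Weights copied from Table 6.
TEACHER_WEIGHTS = {
    "Q1": 0.0899, "Q2": 0.0904, "Q3": 0.0840, "Q4": 0.0890,
    "Q5": 0.0767, "Q6": 0.0838, "Q8": 0.0848, "Q9": 0.0835,
    "Q10": 0.0851, "Q11": 0.0767, "Q12": 0.0798, "Q13": 0.0763,
}
DISCIPLINE_WEIGHTS = {"Q14": 0.2705, "Q15": 0.2383, "Q16": 0.2329, "Q18": 0.2583}
TEACHER_SHARE, DISCIPLINE_SHARE = 0.5307, 0.4693


def weighted_score(responses, weights):
    """Weighted sum of one student's item responses for one construct."""
    return sum(responses[item] * weight for item, weight in weights.items())


def student_evaluation(responses):
    """Teacher, Discipline, and Evaluation scores for one student in one subject."""
    teacher = weighted_score(responses, TEACHER_WEIGHTS)
    discipline = weighted_score(responses, DISCIPLINE_WEIGHTS)
    evaluation = teacher * TEACHER_SHARE + discipline * DISCIPLINE_SHARE
    return {"Teacher": teacher, "Discipline": discipline, "Evaluation": evaluation}


def subject_evaluation(student_responses):
    """Average Evaluation over all students who rated one subject."""
    return mean(student_evaluation(r)["Evaluation"] for r in student_responses)


def aggregate_evaluation(responses_by_subject):
    """Average of the subject-level evaluations (course or institution level)."""
    return mean(subject_evaluation(rs) for rs in responses_by_subject.values())


# Hypothetical data on an assumed 1-to-5 scale:
# two students rate "Subject A", one student rates "Subject B".
items = list(TEACHER_WEIGHTS) + list(DISCIPLINE_WEIGHTS)
responses_by_subject = {
    "Subject A": [{q: 5 for q in items}, {q: 3 for q in items}],
    "Subject B": [{q: 4 for q in items}],
}
print(round(aggregate_evaluation(responses_by_subject), 2))
```

Passing only the subjects of one course to `aggregate_evaluation` gives the course-level result; passing all subjects gives the institution-level result.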
CONCLUDING REMARKS
Given the differentials and the unique context presented to the world, this study sought to contribute to the understanding of student evaluation at that moment by building and validating a model that fulfills the precepts established by the legislation, but in a simplified way. To this end, we identified in the literature the fundamental dimensions for evaluating emergency education. We then developed and tested a model, formed by the constructs Teacher and Discipline, that evaluates higher education institutions and their emergency teaching from the students’ point of view, and we presented a method for applying it.
During the analysis, the respondents indicated great satisfaction with the teachers’ commitment, involvement, availability, and skills during a dark period in Brazilian education. This shows that, even in a difficult moment marked by the lack of teacher training for online environments, work overload, and low-quality access for a considerable part of the students (Gusso et al., 2020), the teachers obtained excellent evaluations of the actions taken during the pandemic.
It is also highly relevant to note that, in the comparison of the competing models, the scale created by integrating the constructs proved more parsimonious than the unidimensional model. The main contributions of this study include reflections on possible improvements in the institutions’ assessment instruments through the practical application of the integrated model. Researchers, agents, managers of HEIs, and other interested parties may apply the scale to determine the institution’s student evaluation index during the COVID-19 pandemic. Additionally, it can serve as a theoretical basis for future research evaluating emerging hybrid education systems. It can also be useful for assessing students in special home-based activity regimes.
As for the limitations of this study, the student evaluation model was applied in a single university and therefore requires validation in other institutions in different health, economic, and social contexts. Another suggestion is to compare it with new models that are more parsimonious than the one demonstrated here.














