INTRODUCTION
Institutional evaluation is an instrument that aims to provide an overview of how a particular group within an academic community perceives various aspects of everyday academic life, from the point of view of the educational process and its improvement. It captures satisfaction, or the lack of it, through questions related mainly to the academic project, and serves as a parameter that provides information with a direct impact on the educational and pedagogical future of the educational institution (1).
Medical education has a great influence on the health care of a population and must be constantly updated to respond to social demands regarding medical service. To this end, medical courses have been undergoing constant updates, both in the content taught in the universities and in teaching and assessment methods. Students are, therefore, an important reference for evaluating the effectiveness of teaching methodologies and course structuring (2).
Medical education has found great strength in evaluative instruments for the improvement of its curriculum. Throughout the undergraduate course, the student has increasing contact with professional practice and must gradually bring theoretical content closer to medical experience. This requires an adapted pedagogical practice, which is not always effective in teaching and in teaching practice. To fill these gaps and bring the student closer to the construction of an educational project that best fits their reality, we use instruments capable of providing a collective impression of the performance of a given course (3).
Feedback from students does not depend on the existence of a formal assessment instrument and may occur in several ways, either through informal comments, inside and outside the academic environment, or through their results in assessment activities. However, evaluation through a formal instrument brings advantages, such as the possibility of documentation and of evaluating impact over a period, as well as a larger number of responses covering more students, thus increasing the reliability of the results obtained (2).
According to the literature, several aspects that can be analyzed correlate directly with student satisfaction regarding the course. These range from the infrastructure of teaching spaces and supervised medical practice, through the performance of the faculty (teaching strategies, didactics, pedagogical planning, interaction with students, among others), to the administrative units of student support (4).
The perception that students are part of the process of building their learning is important for them to understand the need for institutional evaluation and also to want to build it. A large part of this understanding comes from the use of active methodologies, which make students take responsibility for their learning and feel part of that educational community, seeking the improvement and advancement of a teaching process that also belongs to them (5).
With this study, we intend to search databases for articles that have pedagogical-institutional evaluation as their topic and that describe experiences comprising actions of student protagonism for internal institutional evaluation. By doing so, we intend to get a glimpse of how the various forms of evaluation are carried out around the world, their objectives, and their impact on medical education. Such knowledge can encourage the application of evaluations in institutions that do not yet carry them out, and can assist in the improvement and evolution of previously applied questionnaires.
METHODOLOGY
This systematic review follows the standards established by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) (6). This review was not registered and was produced without a prior protocol.
The search for articles took place between December 12 and December 19, 2020, and included the LILACS, SciELO, PubMed, and Embase databases and the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) theses database. The descriptors used were “Evaluation” and “Medical Education” and “Students” or “Undergraduate”.
Regarding the eligibility criteria, only studies published from 2009 to 2019 were included, provided that they contained data on the assessment methodology, dealt specifically with the undergraduate medical course, and took the assessment made by the students into account at some stage. Regarding the exclusion criteria, we excluded studies in which the student perception concerned only one specific pedagogical intervention or was limited to a small number of disciplines, studies that described only the students’ self-assessments, and studies in languages other than Portuguese, English, French, and Spanish.
The PRISMA methodology aims not to restrict the search to a specific time period; however, we have chosen to limit the time frame to the past 10 years, from 2009 to 2019. This decision was made to increase the probability of including recent technologies in the identified studies and to reflect the most current developments in the exploration of the topic.
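The inclusion and exclusion criteria described above can be read as a single yes/no rule applied to each retrieved record. The Python sketch below is only an illustration under stated assumptions: the `Record` fields and the `is_eligible` function are hypothetical names introduced here, and in the review itself these judgments were made by human reviewers, not by software.

```python
from dataclasses import dataclass

# Languages accepted by the review; anything else is an exclusion criterion.
ALLOWED_LANGUAGES = {"Portuguese", "English", "French", "Spanish"}


@dataclass
class Record:
    """Hypothetical summary of one retrieved study (illustrative fields only)."""
    year: int
    language: str
    describes_methodology: bool          # contains data on the assessment methodology
    undergraduate_medicine: bool         # deals specifically with the undergraduate medical course
    student_assessment_considered: bool  # students' assessment taken into account at some stage
    single_intervention_only: bool       # perception of only one pedagogical intervention
    limited_disciplines_only: bool       # restricted to a limited number of disciplines
    self_assessment_only: bool           # describes only students' self-assessments


def is_eligible(record: Record) -> bool:
    """Return True when a record meets every inclusion criterion and no exclusion criterion."""
    meets_inclusion = (
        2009 <= record.year <= 2019
        and record.describes_methodology
        and record.undergraduate_medicine
        and record.student_assessment_considered
    )
    meets_exclusion = (
        record.single_intervention_only
        or record.limited_disciplines_only
        or record.self_assessment_only
        or record.language not in ALLOWED_LANGUAGES
    )
    return meets_inclusion and not meets_exclusion
```

Writing the rule this way makes explicit that the criteria are conjunctive for inclusion and disjunctive for exclusion: a single exclusion criterion is enough to remove a study that otherwise qualifies.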
The Rayyan software was used during the search to identify possible duplicates, which were excluded after human assessment, and to support unbiased and independent evaluation by the reviewers. Two reviewers evaluated the titles and abstracts of the 2,355 identified articles, independently and without bias, looking for those that possibly met the eligibility criteria, with a third independent reviewer consulted in case of disagreement. Subsequently, two reviewers read the selected articles in full, also independently and with an unbiased view, without the need for a third reviewer; the reasons for article exclusion are depicted in Figure 1. During the full-text reading step, 12 citations with potential for inclusion were listed and evaluated by two reviewers.
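The screening rule described above (two independent reviewers, with a third consulted only when they disagree) can be sketched in a few lines of Python. This is a minimal illustration, not the procedure as implemented in Rayyan: the function name `screening_decision` and its parameters are hypothetical and introduced here only for clarity.

```python
from typing import Optional


def screening_decision(
    reviewer_a: bool,
    reviewer_b: bool,
    third_reviewer: Optional[bool] = None,
) -> bool:
    """Combine independent include/exclude votes into one screening decision.

    True means "include for the next step", False means "exclude".
    The third reviewer's vote is used only when the first two disagree.
    """
    if reviewer_a == reviewer_b:
        # Agreement: the shared vote stands and no third opinion is needed.
        return reviewer_a
    if third_reviewer is None:
        # Disagreement cannot be resolved without a third vote.
        raise ValueError("Reviewers disagree; a third reviewer's vote is required.")
    return third_reviewer
```

For example, `screening_decision(True, False, third_reviewer=True)` keeps the article for the next step, while `screening_decision(False, False)` excludes it without involving the third reviewer.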
A specific tool for assessing the risk of bias of the included studies was not used; however, the evaluation by two independent reviewers helped to minimize the risk of bias in the selection and evaluation of the studies.
A data extraction form was developed to collect information from the identified articles. The process was performed in each study by one author and then reviewed by an independent reviewer. Finally, the articles were presented to all authors together with the collected data.
The extraction form contained the following data: identification of the study; country in which it was conducted; type of study; sample of responses; semesters/periods of the evaluated course; timing within the course and periodicity of the evaluation; indication of feedback to the academic community; presence of student protagonism; methodology of the study; use of validated questionnaires; aspects evaluated by the questionnaires; presence of student self-evaluation in the questionnaire; presence of non-students in the respondent sample; and a box for comments.
The data chosen to be evaluated were selected based on the importance of the information for the comparative process among different studies and the objective of including studies that presented a well-established methodology and, therefore, contributed to the application of similar activities in other institutions.
This study received no funding from any institution, nor any incentive or payment. All review authors declare no conflicts of interest.
RESULTS
The analyzed studies varied widely in their methodology, as they were conducted at different universities in different countries, taking into account the reality of each assessed course.
Regarding the moment at which the evaluation is carried out, most studies converge on application at the end of the semester (7)-(10). Sadeghi et al. (2016) refer to the end of the internship rotation period, and in Dehghani et al. (2013) the students make a procedural assessment, which starts at the beginning of the semester and concludes at the end. The other studies did not report the period in which the questionnaires were submitted to the students (11)-(15). At the time of application, most students were in the period of mandatory practice, which is the internship. The specific periods can be seen in Chart 1. Only the articles by Ansari et al. (2017) and Arja et al. (2018) diverge from this population, being respectively about the clinical cycle and the entire medical course.
In terms of periodicity, most studies (7),(8),(10)-(12),(14),(16)-(18) record a single application. Sadeghi et al. (2016) describe applications at the end of each internship rotation, whereas in the article by Arja et al. (2018) the evaluation occurs biannually for two years and in the study by Liang et al. (2018) on an annual basis (Chart 1).
AUTHOR | EVALUATED SEMESTERS | APPLICATION PERIOD | FEEDBACK TO THE ACADEMIC COMMUNITY | STUDENT PROTAGONISM |
---|---|---|---|---|
Sadeghi et al. (2016) | Internship | At the end of each rotation | Present | Absent |
Dehghani et al. (2013) | 8th and 9th semesters | One-time application | Absent | Absent |
Shamsan & Syed (2009) | 1st to 10th semesters | One-time application | Absent | Absent |
Ansari et al. (2017) | Clinical Cycle | One-time application | Absent | Absent |
Arja et al. (2018) | All | Biannual | Absent | Absent |
Alduraywish et al. (2017) | 5 academic levels, not specifying semester/year | One-time application | Absent | Absent |
Masic & Begic (2016) | Last undergraduate year | One-time application | Absent | Absent |
Ranasinghe et al. (2011) | All | One-time application | Present | Present |
Masic (2013) | Last undergraduate year | One-time application | Absent | Absent |
Masic & Begic (2015) | Last undergraduate year | One-time application | Absent | Absent |
Khan et al. (2017) | Fifth year | One-time application | Absent | Absent |
Liang et al. (2018) | NI | Annual | Absent | Absent |
Abbreviations: NI = not included. Source: Prepared by the authors.
As for the evaluation methodology, in terms of questionnaire structure, some of the studies described questionnaires composed of multiple-choice questions (11),(12),(14),(17),(18), while the remainder used a mixed questionnaire, with multiple-choice and written questions (7),(8),(10),(16),(18). The study by Liang et al. (2018) did not make it clear whether it obtained responses through written or multiple-choice questions. The article by Arja et al. (2018) also conducted interviews with its audience, and the article by Ranasinghe et al. (2011) used focus group discussions (Chart 2).
AUTHOR | EVALUATION METHODOLOGY | USE OF VALIDATED QUESTIONNAIRES | SELF-ASSESSMENT |
---|---|---|---|
Sadeghi et al. (2016) | Objective questions | Inspired by the World Federation of Medical Education (WFME) global standards | Absent |
Dehghani et al. (2013) | Objective questions | Dundee Ready Education Environment Measure (DREEM) | Present |
Shamsan & Syed (2009) | Objective questions | Adapted from Student Course Experience Questionnaire (SCEQ), The Good Teaching Scale (GTS), The Generic Skills Scale (GSS), The Clear Goals & Standards Scale (CGSS), The Appropriate Workload Scale (AWS), The Overall Satisfaction Item (OSI), The Appropriate Assessment Scale (AAS) | Present |
Ansari et al. (2017) | Objective questions | Inspired by The System for Evaluating Teaching Qualities (SETQ) and Stanford Faculty Development Program (SFDP26) | Absent |
Arja et al. (2018) | Interview and objective questions | Absent | Absent |
Alduraywish et al. (2017) | Objective questions | Student Course Experience Questionnaire (SCEQ) | Absent |
Masic & Begic (2016) | Objective and subjective questions | Absent | Absent |
Ranasinghe et al. (2011) | Objective questions and focus group discussion (FGD) | Absent | Absent |
Masic (2013) | Objective and subjective questions | Absent | Absent |
Masic & Begic (2015) | Objective and subjective questions | Absent | Absent |
Khan et al. (2017) | Objective and subjective questions | Absent | Absent |
Liang et al. (2018) | Online Questionnaire | Absent | Absent |
Source: Prepared by the authors.
The countries where the studies were carried out and the number of people who answered the evaluations can be seen in Chart 3. The predominance of countries considered to be developing ones, located in the Middle East, Europe, Central America, and Asia, is noticeable. The study with the largest number of participating students (2,771) was that of Sadeghi et al. (2016), and the study with the smallest population, with 94 students, was the article by Khan et al. (2017).
AUTHOR | SITE | STUDY TYPE | SAMPLE |
---|---|---|---|
Sadeghi et al. (2016) | Iran | Original Article | 2,771 |
Dehghani et al. (2013) | Iran | Original Article | NI |
Shamsan & Syed (2009) | Saudi Arabia | Original Article | 341 |
Ansari et al. (2017) | Bahrain | Original Article | 125 |
Arja et al. (2018) | Curaçao | Original Article | 400 |
Alduraywish et al. (2017) | Saudi Arabia | Original Article | 170 |
Masic & Begic (2016) | Bosnia & Herzegovina | Original Article | 459 |
Ranasinghe et al. (2011) | Sri Lanka | Original Article | 186 |
Masic (2013) | Bosnia & Herzegovina | Original Article | 103 |
Masic & Begic (2015) | Bosnia & Herzegovina | Original Article | 365 |
Khan et al. (2017) | Bangladesh | Original Article | 94 |
Liang et al. (2018) | Taiwan | Editorial | 203 |
Abbreviations: NI = not included. Source: Prepared by the authors.
The analysis showed that only the article by Ranasinghe et al. (2011) describes a study with student participation, with the students being the authors of the article in question. Only three of the articles mentioned that the evaluated teachers received the results of the evaluation subsequently, but all of them indicate that the evaluation led to changes in the institution, and in the study by Arja et al. (2018) these changes were in the process of implementing a new curriculum methodology. According to the study by Ranasinghe et al. (2011), the teachers received the assessments and some used them to promote changes in the subjects. The study by Dehghani et al. (2013) describes an evaluation done and presented in the departments systematically, once a semester, mentioning that it was possible to see the impact of the feedback brought by the evaluation on the averages of the evaluated subjects in the following semesters.
The questionnaires of the three studies conducted by Masic (8),(10),(16) were analyzed as if they were a single study, considering that the studies have the same questionnaire and very similar methodologies. Regarding the aspects evaluated by each study, the teachers’ didactics were questioned in seven out of ten (8),(10),(15),(16),(18), and Khan et al. (2017) even asked the students to evaluate the teachers’ knowledge. The subject methodology was an equally present aspect, being identified in the questionnaires of seven studies (8),(10)-(13),(16)-(18). Khan et al. (2017), in particular, investigated several specific strategies of the subjects’ methodology, such as stimulation of logical reasoning in class, adaptation to the students’ method of learning, recommendation of extra material, and approach to current issues, whereas Ansari et al. (2017) evaluated the extent to which the subjects promoted self-learning, without characterizing this directly as an aspect of the methodology.
Aspects of the teacher-student relationship were evaluated in four of the studies (7),(10)-(12), as was the content of the curriculum, which was evaluated in six studies (7),(8),(10),(15),(16),(18). Three of the studies reported including questions related to the students’ self-evaluation (11),(15),(17), with Shamsan & Syed (2009) extending the questioning to include the perception of learning and social self-perception. College infrastructure was also evaluated by five of the studies (8),(10),(11),(15),(16), as were evaluative methods (8),(10)-(12),(16). The learning atmosphere was evaluated by two studies (12),(17), as was compliance with the knowledge objects (10),(12), with Ansari et al. (2017) also questioning whether the knowledge objects were presented clearly. Other issues were present in only one of the studies.
The vast majority of the questionnaires were original for the evaluation in question, even if they were constructed through adaptation or inspired by other questionnaires. Only two studies used ready-made questionnaires prepared by other authors, these being Dehghani et al. (2013), who applied the Dundee Ready Education Environment Measure (DREEM) (19), and Alduraywish et al. (2017), who applied the Student Course Experience Questionnaire (SCEQ) (20).
Most evaluations submitted their questionnaires to students only; however, it was observed in three articles (by Ansari et al., 2017; Arja et al., 2018; Shamsan & Syed, 2009) that teachers and/or tutors also answered a questionnaire in a different version than that answered by students.
Regarding the risk of bias, a risk of selection bias was observed due to the exclusion criteria associated with the language of the articles, along with the chance of missing relevant studies published before or after the adopted time period and the risk of publication bias inherent to the systematic review methodology. To reduce publication bias, the three studies (8),(10),(16) that share the same first author, were performed at the same university, and used the same questionnaire were discussed as one in the comparison between questionnaires.
DISCUSSION
Regarding the periodicity of the evaluations, most of the studies analyzed did not show recurrence: data were collected on a single occasion and were generally associated with a specific intervention, namely a curricular change in the university’s medical course. Ideally, biannual or annual evaluations would come closer to reflecting how a given period of the undergraduate course affected different aspects of that academic community. However, there are great challenges in conducting periodic procedural assessments that follow students as they progress through the course. According to a report published by the OECD in 2018 on college education quality assurance in Brazil, one of the obstacles to a more in-depth assessment that better captures the aspects involving undergraduate courses lies precisely in the difficulty of articulating a unified and standardized process that is applied sequentially (21).
The sample sizes obtained in the studies ranged from large, robust samples to small ones of limited significance. They also varied according to the time of evaluation and the semesters covered in each study. The article by Ansari et al. (2017) depicted a collection of only one year and resulted in a sample of 125 individuals, one of the lowest found in the research, while the study by Sadeghi et al. (2016), which had the largest sample (2,771 individuals), was conducted over four years and applied at each internship rotation, encompassing a significant population of college students. It is difficult to assess whether these values represent a sufficient portion of the population to support inferences, as the studies did not report the total number of their medical school students. This results in a statistical gap and an inability to project the results observed in the assessed group onto the institution as a whole.
The predominance of developing countries in the results points to the association that economic development and university education can assume. Since college education is a highly relevant activity for the public sector, it is understandable that the costs of maintaining it should be justified by effective benefits to society. Thus, the evaluation of education is a coherent action for institutional improvement, monitoring of educational measures, teaching quality, and efficiency in the return to society (22).
The studies in which teachers also answered evaluation questionnaires state the objective of evaluating interventions in the course, such as a new teaching methodology or changes in the curriculum. The teacher is considered a determining factor in the quality of educational service. Therefore, the contribution of the faculty improves the evaluation process, both by allowing teachers to reflect on their teaching practices and implement strategies for improvement and professional development (23), and by allowing the identification of convergences or divergences when compared to the students’ answers (24).
The three articles that described the feedback that the evaluations gave to faculty members may reveal an interesting aspect of the relationship between how often these evaluations are done and their impact. Ranasinghe et al. (2011) describe that faculty members received their grades from the evaluation and that this led to occasional changes in some disciplines, but only that. Meanwhile, Dehghani et al. (2013) report that evaluations, done systematically, are presented every semester to the college departments, and that positive changes can be observed in the evaluation of the subjects in response to the changes made after the feedback. These results indicate that regular, systematic assessments have the potential to promote continuous improvements in the curriculum, while occasional assessments have less potential to do so. Also noteworthy is the study by Arja et al. (2018), in which evaluations were done at a specific period during the implementation of new methodologies in the curriculum, in order to ensure that the implemented changes would please the medical school students and faculty; it was therefore necessary to repeat these evaluations several times, aiming to achieve satisfaction with the changes that were made.
The modest presence of student protagonism observed in the production of the studies reinforces the need for more processes to be centered on medical school students, providing them with a voice and opportunities (25). Launched by the International Association for Medical Education (AMEE), the ASPIRE Award is an initiative that aims to acknowledge “Student Involvement” within faculties of institutions that rely on the contributions of medical school students to shape the curriculum and faculty (26). Before becoming participants and recipients of pedagogical proposals in the educational environment, medical school students are the very reason for their existence. Thus, a sense of responsibility towards their institution, coupled with a commitment to its well-being, advocating for appropriate improvements, and proposing relevant changes, demonstrates a group of medical school students with great potential for their personal development and institutional excellence.
Publications do not necessarily reflect reality, since there may be several evaluations around the world that address the question of this study but were not published, which limits the results to the information that can be accessed through scientific databases. Additionally, publications in languages not covered by our inclusion criteria, or published outside the defined search period, may have been excluded.
CONCLUSION
There are still few studies that address institutional evaluations, and most of them aim at evaluating curricular changes or general changes in the course; they are sporadic and originate from the institution’s management, without student protagonism in their creation and application. It is possible that many institutions carry out some form of periodic institutional evaluation but have not published their methodology or the impact that such assessments generate on the course. For further advances in this field, it is important that evaluation methodologies be published so that they can undergo scientific debate. Another improvement to be made in the field is a better description of the methodologies already used, making the questionnaires fully available and clarifying details about how the data are applied and processed.
The findings of this study raise the possibility that most forms of institutional evaluation are not carried out with student protagonism, which could bring an important point of view to the quality of teaching provided at the institution and point out, in a targeted way, which topics are relevant for the medical students.