STRUCTURED, INDIVIDUALIZED, VIDEO-BASED COACHING TO IMPROVE TECHNICAL SKILL ACQUISITION

In the current era of a reduced resident workweek and competency-based curricula, it is becoming increasingly clear that the cognitive apprenticeship model of training, which dominated the surgical discipline for well over a century, may no longer be the single best strategy for training today's surgeons. Surgical residency training requires a multifaceted approach, especially with regard to improving technical skill acquisition. Video-based coaching is quickly being adopted by many training programs due to its low cost and broad range of applications, including self-assessment, operative evaluations, and bench model training. The challenge for most surgical residency programs is that there is very little discussion regarding the best way to train faculty as coaches, which coaching framework should be used, or how best to incorporate coaching into a training curriculum. The purpose of this research is to address these questions by comparing the implementation of a curriculum using structured, individualized, video-based coaching against self-assessment video analysis alone as a way to improve technical skill acquisition in surgical residents participating in a vascular anastomosis simulation workshop.


Chapter One: Introduction
If John Dewey was correct in his description of the association between experience and education, then "everything depends upon the quality of the experience which is had" (Dewey, 1938, p. 26). The education of surgical residents has undergone radical evolutionary changes over the past two decades, including a change to competency-based education, a mandatory reduction in the resident workweek, and an intense focus on error-free practice. Because of these changes, residency programs have recognized that the apprenticeship model of training, pioneered by William Stewart Halsted in the early 20th century and based on the "see one, do one, teach one" method, may no longer be the most effective strategy to prepare surgical residents for practice (Gallagher, Jordan-Black, & O'Sullivan, 2012). In an attempt to strengthen surgical training and ensure we are providing the public with educated, well-trained, technically competent surgeons, surgical educators are currently focused on improving the quality of residency training programs.
To attain this goal, graduate medical education has significantly increased its adoption and implementation of adult learning theory in its curricula in order to improve the resident experience, facilitate transfer of learning, and improve knowledge and skill acquisition. As noted by Carter and Gogia (2014), however, much of this implementation has been done without the input of qualified educators, leading to inconsistent results and varying degrees of success among programs. This dissertation research specifically targets the acquisition of technical skill in junior surgical residents. This complex subject matter incorporates a variety of theoretical foundations of adult learning, including experiential learning, social constructivism, motor learning theory, distribution of practice, deliberate practice, coaching, a model of skill acquisition, self-assessment, and feedback. The purpose of this dissertation is to explore these theories and concepts, specifically as they relate to surgical training, explain how they may help improve transfer of learning and skill acquisition, and discuss the current limitations that exist with regard to implementing these theories in practice.

Background

Apprenticeship Model of Training
The Oxford English Dictionary defines an apprentice as: "A learner of a craft, bound to serve, and entitled to instruction from, his or her employer for a specified period" (Oxford English Dictionary, 2017). The most common method for training surgeons throughout the ages has, indeed, been through variations of the apprenticeship model. While there was no set standard for these apprenticeships, an individual in the 16th century would begin their journey around the age of twelve or thirteen and remain in an apprenticeship for five to seven years (Polavarapu, Kulaylat, Sun, & Hamed, 2013).
In its most basic form, the student learns the art and science of surgery through direct observation and imitation of the actions of their mentor in both the clinical and operating room environments (Dunnington, 1996). The main advantage of a natural apprenticeship model such as this is that learning is situated in the context of a specific activity, where skills are directly observed. As Lajoie (2010) points out, however, the pitfall of this natural apprenticeship model is that "cognitive skills are not readily observable and adaptations to such an apprenticeship model are necessary to support education more broadly" (p. 64).
Most apprenticeship models in place today, however, are based on the cognitive apprenticeship model. A cognitive apprenticeship is defined by Collins, Brown, and Newman (1989) as "learning through guided experience on cognitive and metacognitive, rather than physical, skills and processes" (p. 456). This form of apprenticeship revolves around a complex interaction between mentor and student that depends on expert demonstration and careful guidance, especially in the early stages of learning. Building off this cognitive apprenticeship model, and recognizing that learning occurs through social interaction, Lave and Wenger (1991) developed their theory of situated learning and the model of legitimate peripheral participation. This theory attempts to explain the way in which learning occurs through socialization within what they call "communities of practice." More specifically, this model concerns the "relations between newcomers and old-timers, and about activities, identities, artifacts, and communities of knowledge and practice" (pp. 285-286). Understanding our current model of surgical training allows us to better appreciate how, over the course of five years, surgeons move from their intern year, where they are mostly observers on the periphery of participation, to their fifth and final chief year, during which they become functionally independent and full participants within the surgical community of practice. This apprenticeship model, in both its natural and cognitive forms, has been the mainstay of surgical training since its inception and ultimately became the educational gold standard for surgical residency programs throughout the world. How this came to be becomes more apparent when one takes a closer look at the way surgical training evolved from a trade into a profession over the last few centuries.

Historical Perspective
Europe. The first efforts to improve surgical training date back to Paris in 1210 AD, at the College de Saint Come (Franzese & Stringer, 2007). It was here that attempts were made to distinguish academic surgeons, who had acquired training or attended a university, marked by the donning of a long robe, from the less reputable, or short-robe, barber-surgeons (Bodemer, 1983). Despite this distinction, however, surgeons were far from being considered "equal" to their medical physician counterparts (Loudon, 2000). The medical physicians' supposed superior knowledge throughout this era gave them the theoretical authority to oversee the work of surgeons (Cope, 1959). This division between the two disciplines, however, would soon begin to weaken with the rapid growth of the voluntary hospitals in Europe between 1730 and 1800. This, along with the work of anatomist and surgeon John Hunter, allowed the "hospital surgeon" to emerge as a consultant of high professional rank (Temkin, 1951).
It was this group of surgeons that ultimately led to the formation of the Company of Surgeons in 1745 (Morgan, 1968). The Company of Surgeons recognized the importance of providing formal lectures, along with an apprenticeship, as part of the education and training of future surgeons. According to the Company bylaws, each "Master" surgeon could have up to three apprentices, and each apprentice was required to provide seven years of servitude (Wall, 1937). In addition, the Company of Surgeons helped with the planning and construction of an anatomy theater, close to Newgate prison in West London, in order to facilitate the dissection of deceased criminals for instructive purposes (Robinson, 1984).
A key aspect of the Company's role, however, was to register and oversee a surgeon's apprenticeship, and then formally examine these surgeons upon its completion (Tyte, 2011). Upon completion of their seven years, examinations took place in Latin, and candidates were questioned on all aspects of surgery, which was defined as:

not only the external and actual practice thereof, but also the internal speculation of the natural causes and remedies of all manner of infirmities or diseases incident to the said practice and profession and of the natures and qualities of all manner of emplasters, ointments, medicaments, baths, waters, drugs, and herbs pertaining thereunto. (Wall, 1937, p. 49)

Due to their advancement of the craft of surgery, the Company of Surgeons was granted a new charter by King George III on March 22, 1800, bestowing upon them the title of the Royal College of Surgeons (Cope, 1959). Despite this professional achievement, however, the division between medicine and surgery continued to persist in Europe. During the granting of the charter, Lord Thurlow, a member of the House of Lords, commented, "There is no more science in surgery than in butchering" (Rogers & Münsterberg, 1906, p. 306). It wasn't until the 1850s, when Germany began the standardization of examinations and licensing, as well as the recognition of formal specialties including surgery, that the process of substantial medical reform began taking place in Europe (Shryock, 1965).
United States. While this transformation of the education and training of surgeons was slowly gaining momentum in 18th-century Europe, the same cannot be said for the United States. Most medical practitioners in colonial America had little to no formal education, as they were either self-taught or came into their craft through informal apprenticeships (Goldowsky, 1988; Warren, 1958). Of the nearly 4,000 practitioners in 1776, only around 400 had a formal medical degree, and the majority of these came from the Edinburgh School of Medicine in Scotland (Bordley, 1976). Colonial practitioners of medicine, unlike their European colleagues, were expected to be a "jack of all trades," encompassing medicine, surgery, and the dispensing of the herbal remedies and medicines of the time (Grillo, 1999). In fact, in 1690, when congregationalist minister Samuel Lee of Massachusetts was asked about the undifferentiated character of medical practice in New England, he wrote: "Practitioners are laureated gratis with the title feather of Doctor. Pothecaries, surgeons & midwives are dignified according to success" (Gill, 1972, p. 19).
Aside from the lack of formal education available for those who wanted to practice medicine and surgery in colonial America, there was another factor limiting its progression. Unlike Europe, where hospitals had flourished for centuries, the only facilities that existed in America were almshouses, built by charities to help care for the old and the ailing poor (Stain, 2015). It wasn't until the first hospital in America, the Pennsylvania Hospital, opened its doors in 1751 that medical, and particularly surgical, education began to take shape on our soil (Williams, 1972). It was in this institution that the current apprenticeship model began to formally develop, as "attending Physicians brought with them their students, or apprentices, to follow the practice of the house, to apply dressings, and render other assistance" (Morton & Woodbury, 1895, p. 479). This arrangement ultimately led to the opening of the first medical school in the original thirteen colonies in 1765, the Medical College of Philadelphia (Fee, 2015). Following in the footsteps of the European model of medical schools, particularly Edinburgh, this medical college set strict requirements for admission, which included education in the liberal arts, mathematics, and natural history; knowledge of Latin and preferably French; and a required apprenticeship of no less than three years with a reputable physician (Flexner, 1910).
The 19 th century was one of rapid growth for medical and surgical education, with hospitals and medical colleges opening their doors in most major cities. Unfortunately, oversight for these institutions was severely lacking, and it became apparent that the only state requirement for licensure was a medical school diploma (Miller & Weiss, 2008).
Sadly, even applicants who were barely literate were granted admission to these medical colleges, as the for-profit institutions competed to provide applicants with "the fastest, easiest, and cheapest education" (Kaufman, 1976, p. 42) as opposed to the best training.
It was this combination of questionable education at these colleges, insufficient medical knowledge, improper training, and laxity in state licensure that drove the formation of the American Medical Association (AMA) in 1847. One of the AMA's first mandates was to improve medical training in America and assure physician competency (King, 1983). Because of the inadequacy of American medical education and surgical training during this time, students who wanted to further their education often turned to Germany and Austria (Grillo, 1999). One surgeon, in particular, decided to do just that, and forever changed the landscape of surgical training in the United States.
Halsted, the German System, and Churchill. William Stewart Halsted received an undergraduate degree from Yale in 1874 and his MD degree from the College of Physicians and Surgeons of New York in 1877. After practicing as a house physician at Bellevue Hospital in New York City from 1877-1878, Halsted realized he had acquired all of the training available to him in the United States, so he travelled to Europe to further his surgical studies in Vienna and Würzburg from 1878-1880 (Cameron, 1997). It was during this time that Halsted became acquainted with the surgical residency training model in Germany. This training model, which was much different from that offered in the United States, was built upon the premise that surgical trainees should receive an increasing amount of responsibility with each advancing year (Hamdorf & Hall, 2000).
It was this "pyramidal" model that Halsted sought to replicate when he joined the staff of the newly built Johns Hopkins Hospital in Baltimore in 1889.
Modeled after what he had seen in Germany, this surgical apprenticeship was based upon the selection of eight interns annually. From this group, four stayed on for only one year, while the other four remained for varying terms. Of the initial eight, only one ultimately became the senior house surgeon, with the other three waiting their turn for preferment. The average term for one of these house surgeons was eight years in total (Grillo, 1999). It is important to note that in this model, resident advancement was not guaranteed. Halsted felt that this rigorous training in surgery would help produce academic leaders in both surgical practice and surgical research. He described the purpose of his residency program as follows: "We need a system, and we shall surely have it, which will produce not only surgeons, but surgeons of the highest type, men who will stimulate the first youths of our country to study surgery and to devote their energy and their lives to raising the standard of surgical science" (Halsted, 1904, p. 273).

Halsted's program at Johns Hopkins became the model for surgical training in the United States, and in 1928 the American Medical Association House of Delegates approved the principles of this model to guide surgical residencies and fellowships (Hamdorf & Hall, 2000). Not everyone, however, bought into this model as the best way to train a surgeon. Edward D. Churchill, who became chief of surgery at Massachusetts General Hospital (MGH) in 1931, took issue with the strict "pyramidal" training system the Halsted model encouraged, particularly its focus on producing a single master-apprentice. Churchill believed in a more broad-based surgical training system, one built on "the school concept of a group of masters, in which no single personality dominates the methods and technology of the institution" (Grillo, 2004, p. 949).
Based on this philosophy, Churchill proposed a model of surgical training in which he would select six residents a year for training that would last four to five years. At the end of this time, these residents would be fully trained in general surgery and capable of independent practice (Helling, 2016). This "rectangular" model of surgical training was put into effect in 1940 at MGH in place of the "pyramidal" system popularized by Halsted. According to Grillo (2004), this model of training proved extremely successful because it "(1) succeeded in giving complete training to superior surgeons, (2) it eliminated the human wastefulness of the pyramidal system, and (3) it responded to the nation's surgical needs after World War II" (p. 951). Because of this, the Halsted model of surgical training that had been in place for over 50 years was ultimately replaced by Churchill's rectangular system, as it better met the needs of both surgeons and society as a whole.

Present Day Surgical Training. Today, general surgery training in the United States is a five-year program that may be extended if the candidate chooses to pursue research training (Accreditation Council for Graduate Medical Education, 2019). While the rectangular model of resident selection, in conjunction with an apprenticeship model of training, remains the basis for surgical residency, graduate medical education has undergone significant changes since the start of the 21st century. These changes include a focus on discipline-specific core competencies, the institution of a reduced, 80-hour workweek for residents, and an emphasis on patient safety and error-free practice. Because of this, the ability to define competent performance in practice has quickly become a major focus for all members of the American Board of Medical Specialties (ABMS). This effort was best highlighted in the 2001 joint report of the ABMS and the Accreditation Council for Graduate Medical Education (ACGME) on surgical competencies and the criteria needed for maintenance of certification (Ritchie, 2001). The impetus for these national initiatives was, in part, two major events at the end of the 20th century.
The first event was the tragic, unexpected death of a young woman, Libby Zion, in a New York hospital in 1984. Libby was a college freshman with a history of depression who presented to the emergency room with fever, agitation, and disorientation (Lerner, 2006). Six hours after her admission, she died of cardiac arrest, the result of an interaction between her home antidepressant and a medication prescribed to control her agitation. Her care during those six hours was managed by an emergency room resident and an intern, and no attending physician was called, even as her condition deteriorated. After a long court battle, the investigation concluded that her death was a direct result of a lack of resident oversight and resident fatigue (Patel & Popp, 2014).
The second event was the Institute of Medicine's (IOM) original publication, To Err Is Human (Donaldson, Corrigan, & Kohn, 1999). In an effort to address patient safety within health care, data were randomly collected over a fifteen-year period from multiple healthcare institutions in New York, Utah, and Colorado in search of adverse events. The researchers defined an adverse event as "an injury that was caused by medical management (rather than the underlying disease) and that prolonged hospitalization, produced disability at the time of discharge, or both" (Brennan et al., 1991). This research found health care to be far behind other industries in ensuring basic safety and, based on the data, estimated that as many as 98,000 people die annually in American hospitals due to medical error. The report called for immediate action from all agencies to help reduce this number of deaths by 50% over the following five years.
In response to these two events, the ACGME implemented its resident outcomes project in July 2001, which delineated six specific core competencies of medical training against which all residents would be assessed (Mery, Greenberg, Patel, & Jaik, 2008).
These core competencies, which are now fully integrated into all ACGME residency programs, include medical knowledge, patient care, professionalism, interpersonal and communication skills, practice-based learning and improvement, and systems-based practice (Swing, 2007).
In addition to these competencies, and in an effort to address patient safety concerns and resident well-being, the ACGME moved forward with a plan to reduce the hourly workweek for all residency training programs. In July 2003, the 80-hour residency workweek officially went into effect (Philibert, Friedmann, & Williams, 2002) and dramatically altered the surgical training landscape.

Statement of the Problem
By 2010, it became apparent that the resident work-hour restriction was creating a new set of problems for residency training. These included a growing trend toward a shift-work mentality; programs paying too much attention to duty hours and too little to resident supervision; and the creation of ethical dilemmas for residents, such as having to choose between leaving patients to comply with duty hours or staying to provide care and violating those hours (Nasca, Day, & Amis, 2010). The cumulative effect of these work-hour restrictions had even further repercussions for surgical training. Despite an upward trend in overall surgical case volume, general surgery residents were experiencing a decrease in their operative case exposure (Watson, Flesher, Ruiz, & Chung, 2010). In a survey of surgical residency program directors, Mattar et al. (2013) reported that, of graduating chief surgical residents, 66% were unable to perform a major procedure unsupervised for more than 30 minutes, and, among those performing laparoscopic procedures, 26% could not identify anatomic planes, 56% could not suture, and 30% could not independently perform a laparoscopic cholecystectomy.
Although another work-hour restriction was instituted in 2011 to limit first-year residents to 16 continuous hours of duty, this was reversed by the ACGME in the 2017 common program requirements. The reversal was due to multiple studies showing, in part, a negative impact on resident education (Bolster & Rourke, 2015), an unintended increase in self-reported medical errors (Sen et al., 2013), and a significant decrease in operative experience for first-year surgical residents. Given this trend in general surgery training, surgical residents are, themselves, feeling less competent in their technical skills upon graduation. Coleman, Esposito, Rozycki, and Feliciano (2013) reported that nearly 40% of residents lacked confidence in their skills after five years of training, and Fronza, Prystowsky, Darosa, and Fryer (2012) reported that graduating residents feel less competent in their ability to perform many "core" general surgical operations. This was echoed in a more recent review of 14 general surgery programs involving 872 faculty, 511 categorical residents, and over 10,000 procedures, in which George et al. (2017) found that US general surgery residents are still not universally ready to perform common "core" procedures independently upon completion of residency training. It appears that the policies put in place to improve patient safety, improve resident lifestyle, and reduce medical errors have had the unintended consequence of eroding the technical competency of residents graduating from surgical training programs.
Given this trend, faculty educators continue to look for novel ways to improve surgical training and technical skill acquisition, both inside and outside the operating room. Because of its applicability to a wide range of specialties, simulation training has continued to gain traction over the last twenty years as a means of improving residents' technical skills. Simulation can be defined as "a technique to replace or amplify real experiences with guided experiences, often immersive in nature, that evoke or replicate substantial aspects of the real world in a fully interactive manner" (Gaba, 2004). The use of simulation training in surgery dates back to ancient India in 600 BC, where leaf and clay models were used to simulate the first recorded operation, the forehead flap nasal reconstruction (Limberg, 1984). Since that time, surgeons have primarily used cadavers and/or animal models to simulate surgery, but the downsides of cost, availability, anatomical variation, and ethical considerations can be prohibitive for many institutions (Rosen, Long, McGrath, & Greer, 2009). Because of these limitations, inanimate bench models for teaching technical skills were quickly adopted by most surgical training programs due to their low cost, portability, and suitability for repeated use. Specialty-specific bench model workshops were developed nearly forty years ago in the United Kingdom by the Royal College of Surgeons (Bevan, 1981). It was there that one of the first vascular anastomosis bench model simulation classes was held, employing the newly created Buckston Browne Arterial Jig, developed by vascular surgeon Dr. Roger Greenhalgh (Greenhalgh & Flack, 1981). This jig allowed easy demonstration and practice of the proper anastomosis suturing technique of an open aneurysm repair under direct supervision.
After the initiation of these workshops and the development of the Buckston Browne Arterial Jig in the 1980s, simulation workshops became much more common in surgical training programs. These early anastomosis bench model classes, however, were conducted largely without objective measurement of the participants; it was assumed that simply attending and participating in the workshops would ensure a resident's technical skill acquisition (Atkins, Kalu, Lannon, Green, & Butler, 2005). More recently, however, various models for evaluating a resident's technical skills in vascular anastomosis bench model workshops have been well described and validated in the literature (Datta et al., 2002; S. Schwartz et al., 2014; Wilasrusmee, Phromsopha, Lertsitichai, & Kittur, 2007).
Research has also shown a transferability of skills and a positive correlation between simulation training and technical competency in the operating room and clinical setting. Using a vascular anastomosis simulation workshop, Wilasrusmee, Lertsithichai, and Kittur (2007) found that time to complete the anastomosis and grade of anastomotic leak were predictive of technical competency in the operating room. These positive results have also been shown in other specialty simulation workshops. In a review of twenty randomized controlled trials in laparoscopic surgery, Vanderbilt et al. (2015) found evidence of improved clinical performance in surgeons who underwent simulation-based training. In another systematic review of twenty-seven randomized controlled trials and seven comparative studies encompassing both laparoscopy and endoscopy, Dawe et al. (2014) concluded that "these studies provided strong evidence that participants who reached proficiency in simulation-based training performed better in the patient-based setting than their counterparts who did not have simulation-based training" (p. 1063).
Perhaps even more important, however, is the evidence suggesting an association between simulation training and improved patient outcomes (Barsuk et al., 2018; Brydges, Hatala, Zendejas, Erwin, & Cook, 2015). These findings have led surgical training programs to place a much greater emphasis on using simulation bench model workshops to improve junior residents' performance on competency-based procedural skills.
Advances in technology have also made alternative surgical training modalities, such as virtual reality and robotic trainers, very attractive to program directors and educators, but their implementation remains challenging and they remain cost-prohibitive for most residency programs (Nagendran, Gurusamy, Aggarwal, Loizidou, & Davidson, 2013). Due to its low cost and broad range of applications, including self-assessment, evaluations, and use in bench model simulation, video recording with playback and analysis has quickly found its way into residency training programs over the last decade. Building on this concept, some programs have now begun to incorporate video-based coaching as a method to help improve technical skill acquisition, but its use has not been standardized, leading to varied implementation and inconsistent results (Min, Morales, Orgill, Smink, & Yule, 2015).
What makes these efforts more complicated, however, is that despite the major policy initiatives that began at the turn of this century, surgical faculty and educators have struggled to agree upon a validated, reliable way to assess technical competency in surgical training. Although the concept of assessing residents on clinical competencies was initiated in 2001, it wasn't until the ACGME Milestone Project in 2014 that "technical skill" was added to the list of required competencies for surgical residents (Cogbill, Malangoni, Potts, & Valentine, 2014). Another reason assessing technical skill has proven difficult is the continued lack of consensus on a formal definition of technical competency. Szasz, Louridas, Harris, Aggarwal, and Grantcharov (2015) note that one explanation is that the literature tends to use the terms competence and proficiency interchangeably. This is an incredibly important distinction, and one that this research intends to address using a model of skill acquisition.

Purpose of the Study
The purpose of this research is to employ a validated technical skills checklist to compare structured, individualized, video-based coaching against self-assessment video analysis as a way to improve technical skill acquisition in surgical residents participating in a vascular anastomosis simulation workshop. The hypothesis for this research project is that structured, individualized, video-based coaching improves technical skill acquisition to a greater extent than self-assessment video analysis.

Research Questions
RQ4: To what extent do the current posttest scores on time to fashion anastomosis and leak rate compare with the prior three years of resident vascular anastomosis scores?

Significance of the Study
In the current era of surgical residency training, faculty and educators need to structure curricula in a way that efficiently promotes the transfer of learning and improves technical skill acquisition. Coaching, combined with playback video analysis, has been increasingly employed as a way to improve technical skill acquisition (Bonrath, Dedy, Gordon, & Grantcharov, 2015; Soucisse et al., 2017), but how this compares with self-assessment video review has yet to be explored in the surgical literature. Until now, our own local surgical residency program has not included this type of technology in our simulation labs. In our vascular anastomosis simulation lab, specifically, technical skills have been assessed using only two primary variables: time to perform the anastomosis and leak rate. I expect this research to demonstrate that it is feasible to train surgical faculty to adapt a structured coaching format into a surgical training curriculum in order to promote transfer of learning and improve technical skill acquisition. The findings from this research will add to the current literature on the use of video analysis for surgical assessment and help further define the role of self-assessment video analysis in technical skill acquisition. My goal is to adapt what I learn from this simulation model to the operating room environment, specifically in training residents to perform common core general surgical procedures such as laparoscopic cholecystectomy and laparoscopic appendectomy.

Definition of Key Terms
To aid the reader, this section includes definitions of common terms that are used throughout the study.
Apprentice: "A learner of a craft, bound to serve, and entitled to instruction from, his or her employer for a specified period" (Oxford English Dictionary, 2017).
Coaching: "The art of creating an environment, through conversation and a way of being, that facilitates the process by which a person can move toward desired goals in a fulfilling manner" (Gallwey, 2002, p. 177).
Consolidation theory: Memories remain in a state that is vulnerable to disruption immediately after learning a new task, and take time to become fixed (or consolidated) (Lechner, Squire, & Byrne, 1999).
Deliberate practice: "Individualized training activities especially designed by a coach or teacher to improve specific aspects of an individual's performance through repetition and successive refinement" (Ericsson & Lehmann, 1996, p. 278).
Feedback: "Specific information about the comparison between a trainee's performance and a standard, given with the intent to improve the trainee's performance" (Van De Ridder, Stokking, McGaghie, & Ten Cate, 2008).
Goals: "A cognitive image of an ideal stored in memory for comparison to an actual state; a representation of the future that influences the present; a desire (pleasure and satisfaction are expected from goal success); a source of motivation, an incentive to action" (Cochran & Tesser, as cited in Street, 2002, p. 100).
Knowledge of Results: "Information, and how the subject will transform and use it will depend on the type and accuracy of knowledge of results and on the kind of motor task" (Adams, 1976, p. 90).

Knowledge of Performance: "Augmented information about the movement pattern the learner has just made" (Schmidt et al., 2018, p. 438).
Learning: "The process whereby knowledge is created through the transformation of experience" (Kolb & Kolb, 2005).
Mentor: "A voluntary and active participant in the personal and professional development of the mentee, offering knowledge, experience, guidance, support, and opportunity for advancement" (Strowd & Reynolds, 2013, p. 244).
Reflecting-on-Practice: a way for practitioners to "think back on a project they have undertaken, a situation they have lived through, and they explore the understandings they have brought to the handling of the case" (Schon, 1983, p. 61).
Self-Assessment: ''The involvement of students in identifying standards and/or criteria to apply to their work and making judgements about the extent to which they have met these criteria and standards'' (Boud, 2013, p. 12).
Simulation: "A technique to replace or amplify real experiences with guided experiences, often immersive in nature, that evoke or replicate substantial aspects of the real world in a fully interactive manner" (Gaba, 2004).
Technical skill: "The ability to use surgical instruments in an effective and efficient manner and includes economy of motion and safe tissue handling" (Mellinger et al., 2017, p. 52).
Transfer of Learning: "The effective application by program participants of what they learned as a result of attending an education or training program" (Caffarella, 2012, pp. 5212-5213).

Chapter 1 Summary
This chapter presented an overview and discussion on the evolution of the apprenticeship model of surgical training in both Europe and the United States.
Landmark events surrounding medical and surgical training at the end of the 20th century were identified in relation to the national policies put in place by the ACGME to ensure patient safety and reduce medical errors through competency-based medical and surgical education. The purpose of the study was to employ a validated technical skills checklist to compare structured, individualized, video-based coaching against self-assessment video analysis as a way to improve technical skill acquisition in surgical residents participating in a vascular anastomosis simulation workshop. While a review of the available literature has shown significant benefits when comparing coaching to standardized training, no literature has compared coaching to self-assessment when benchmark videos are provided. The intent of this study is to add to the current literature on the use of video analysis, coaching, and self-assessment in relation to technical skill acquisition in junior surgical residents.

Organization of Study
This research is organized into five chapters. Chapter one introduced readers to the history of the surgical apprenticeship model and the current status of surgical residency training in the United States. The chapter also introduced the reader to the statement of the problem, the purpose of the study, and the research questions. The significance of the study was also discussed, along with definitions of key terms and the organizational structure of this research.
Chapter two will focus on an in-depth review of the literature, including key concepts and an overview of the theoretical frameworks that guided this research. More specifically, the reader will be introduced to a review of motor learning theory and its related constructs, including distributive practice, deliberate practice, and a model of skill acquisition. This will be followed by a discussion of experiential learning and the related constructs of mentoring and coaching as they relate to professional development, and the literature surrounding video-based coaching in surgical training programs will be explored. Finally, a discussion on self-assessment will be provided in relation to its ability to assist with technical skill acquisition.
Chapter three is devoted to the methodological structure of this research, including the research design, sampling, participants, and instruments employed for data collection. This chapter will also address the research procedures employed, a discussion on the validity of the instrument used in this research, as well as the materials and methods used for data collection and analysis.
Chapter four will focus on the presentation of each research question with descriptive and inferential statistical analysis.
Chapter five will include an in-depth discussion and interpretation of the findings, as well as the conclusion, limitations of the study, and suggestions for future research.

Chapter Two: Review of Related Literature
For this study, in order to properly address the stated problem, help guide the research design, and answer the research questions, I will explore the theory surrounding technical skill acquisition as well as the related constructs that support psychomotor development. This review will help the reader better place these theories in the context of the second-year junior surgical residents who participated in the vascular anastomosis simulation workshop around which this research project was designed. I will first provide a brief overview of motor learning theory, as this forms the basis of understanding human psychomotor development. This will be followed by a discussion of the related constructs of technical skill acquisition, including distributive practice and deliberate practice, as well as a model of skill acquisition and methods to facilitate learner movement through this model. These constructs will all be defined, described, and discussed in relation to surgical education. This will include a discussion of experiential learning, mentoring, and coaching. Finally, I will discuss the currently available literature related to video analysis and coaching in the surgical arena and provide an overview of self-assessment and its role in performance improvement.
The limits of professional ability, and the factors that contribute to an individual becoming an expert in a particular domain, have been widely researched since the 19th century, starting with Sir Francis Galton. In his published work, titled Hereditary Genius (1869), Galton correctly presented evidence that body size and height were genetically predetermined. Galton's theory, however, incorrectly proposed that other innate characteristics, such as exceptional performance, must also be transmitted from parents to their children. Galton acknowledged that while one can improve performance through practice, there was a ceiling or limit to one's abilities, regardless of how much practice was put in, because it was dictated by one's genetic makeup. Galton emphasized that rapid performance improvements are only evident when one begins training, with subsequent gains becoming increasingly smaller until "maximal performance becomes a rigidly determinate quantity" (p. 15). Galton's theory became widely known as the nature versus nurture debate and, due to the lack of research on motor learning in the early 20th century, this belief remained the mainstay of popular culture until the middle of the 20th century (Ericsson, Krampe, & Tesch-Römer, 1993).
The literature has shown that there are a host of other factors that contribute to performance excellence, including motivation and repeated exposure (Ericsson et al., 1993), coaching (Ericsson, 2004), and situational practice (Chase & Simon, 1973).
Before I address these concepts, particularly as they relate to this research, a discussion on basic motor learning theory is required, as it forms the foundation of any research into human psychomotor development.

Motor Learning Theory
Technical skills can be defined as "the ability to use surgical instruments in an effective and efficient manner and includes economy of motion and safe tissue handling" (Mellinger et al., 2017, p. 52). To help residents attain this level of motor competency, surgical educators should have an understanding of how humans process information in order to produce long-term changes in their skilled behavior. Theories pertaining to motor learning have been in development since the late 19th century, starting with investigations into sending and receiving Morse code (Bryan & Harter, 1897, 1899) and the acquisition of typing skills (Book, 1925). This research led to an explosion of work by behavioral psychologists, including Thorndike's investigation into knowledge of results (Thorndike, 1927), Hull's classic work on conditioning (Hull, 1943), and Skinner's research on reinforcement (Skinner, 1965). The research presented in the following chapters, however, is grounded in two theories developed in the latter part of the 20th century by Jack Adams (Adams, 1971) and Richard Schmidt (Schmidt, 1975). Jack Adams' closed loop theory of motor learning (1971, Appendix A) was built on the premise that a simplified stimulus-response behavioral theory, which had been developed using animal models only, was insufficient to explain skilled motor behavior acquisition in humans. Adams took particular issue with the fact that, in this account, some outside stimulus was required for a motor response to occur. This open-loop model, then, meant that as long as the motivational and habitual states remained constant, the same response (movement) would always be produced (p. 117). Adams realized that this model did not account for the effect feedback can have on the completion of a motor task, particularly feedback regarding error.
As such, Adams' closed loop theory of motor learning was the first to be built around the premise of an empirical reinforcing event, or "knowledge of results" (Adams, 1976, p. 88). As defined by Adams, knowledge of results is "information, and how the subject will transform and use it will depend on the type and accuracy of knowledge of results and on the kind of motor task" (p. 90). In contrast to an open loop motor system, which, according to Adams, "has no feedback or mechanisms for error regulation" (p. 89), a closed loop system includes "feedback, error detection, and error correction as key elements" (p. 89).
In addition to this construct, Adams describes two states of memory which have a determining role in any motor task, labeled as the memory trace and the perceptual trace.
The memory trace is responsible for initiating a movement, choosing initial direction and determining the early portions of movement for the criterion motor objective (Adams, 1971, p. 125). The perceptual trace, on the other hand, is the reference mechanism and resulting movement that is developed from past experience and used to guide movement to the correct location along the prescribed pathway (p. 124). It is this perceptual trace, according to Adams, which allows an individual to become more accurate and confident, over time, in carrying out a motor response.
Acknowledging some limitations of Adams' theory, such as the need to first experience a correct location in order to accurately move to that location, the question of one's ability to continue to learn without knowledge of results, and the lack of adaptability to open-loop systems (changing environments), Richard Schmidt (1975) proposed his schema theory of motor learning (Appendix B). Borrowing a concept originally introduced by Head (1925) in the psychology discipline, a schema can be defined as "a rule, developed by practice and experience across a lifetime, which described a relationship between the outcomes achieved on past attempts at running a program and the parameters chosen on those attempts" (Schmidt, 2003, p. 367). More specifically, schema theory is based on the formation of a generalized motor program (GMP) for motor learning. Schmidt, Lee, Winstein, Wulf, and Zelaznik (2018) define a GMP as "a motor program for a particular class of actions that is stored in memory and that a unique pattern of activity will result whenever the program is executed" (p. 199). These patterns of activity are then further broken down into a recall schema, which is concerned with the production of movement, and a recognition schema, in which a "person can estimate the sensory consequences that will occur if that movement outcome is produced" (p. 383), and which is analogous to the perceptual trace as defined by Adams.
Perhaps most notable about Schmidt's theory is his description of the various types of feedback that exist, and the role knowledge of results (KR) and knowledge of performance (KP) play in motor learning. According to Schmidt (Schmidt & Wrisberg, 2004), feedback itself can be divided into two discrete forms, intrinsic and extrinsic.
Intrinsic feedback arises from the sensory information gathered from performing a task and can be either proprioceptive (sensory information received from within the learner's body, such as from joints, muscles, and tendons) or exteroceptive (sensory information acquired primarily through vision and hearing) in nature (p. 277). Extrinsic feedback (also referred to as augmented feedback), by contrast, refers to information received from an outside source, in addition to the intrinsic feedback the learner has already received (p. 279). It is this type of feedback, according to Schmidt, that can affect one's performance, either through KR or KP.
Schmidt put forth a definition of KR that is slightly different from, and more specific than, what Adams (1971) proposed. According to Schmidt, KR is "verbal (or verbalizable), terminal (i.e. post-movement) feedback about the outcome of the movement in terms of the environmental goal" (Schmidt et al., 2018, p. 343). Important to note in this definition is that KR, according to Schmidt, does not describe the performance outcome, but simply tells the learner whether he or she has achieved the goal of the performance. Schmidt makes two points about this type of KR. First, it is usually the only meaningful source of outcome information available to the learner; second, KR is typically administered in verbal form as terminal (as opposed to concurrent) feedback (Salmoni, Schmidt, & Walter, 1984). This form of feedback is different from KP, which Schmidt defines as "augmented information about the movement pattern the learner has just made" (Schmidt et al., 2018, p. 438). It is this type of feedback that is typically given by instructors or coaches to correct improper movement in learners, and it can be given verbally as well as through nonverbal means, such as video replay. The fundamental concepts of Adams' closed loop model of motor learning and Schmidt's schema theory are paramount to the research that follows and provide a framework for understanding the following concepts, which ultimately contribute to technical skill acquisition in junior surgical residents.

Distributive Practice
The concept of spaced versus massed training has been studied extensively over the last century and a half (see Ebbinghaus, 1885/1964; Hull, 1943, for historical reference). Massed training conditions are defined as "those in which individuals practice a task continuously without rest," whereas spaced training conditions are defined as "those in which individuals are given rest intervals within the practice session" (Donovan & Radosevich, 1999, p. 795). The majority of this literature, and of the current literature, supports the notion that short practice sessions with appreciable time intervals between them (distributive practice) lead to better acquisition and retention of skill than practicing a task in a continuous (massed) block (Donovan & Radosevich, 1999; Lee & Genovese, 1988). While interest in, and potential applications of, this research remained strong in the educational literature (Adams, 1987), only over the last fifteen years has the surgical world begun to embrace this concept and evaluate the outcomes of distributive training with regard to surgical skill acquisition. Moulton et al. (2006) compared massed and distributive training with junior surgical residents attending a microvascular anastomosis lab and found a significant difference in both skill retention and competency favoring the residents who received distributive training. Gallagher et al. (2012) also adapted this style of distributive training to a virtual reality simulation curriculum for laparoscopic cholecystectomy skill acquisition. The authors were able to show better rates of performance improvement, and a statistically significant decrease in errors made, among the distributive training group. Even more important, however, was the authors' finding that once a novice acquired skills through laparoscopic training, significant degradation of that skill occurred after two weeks of nonuse.
These studies within surgical training reinforce the concept that retention of motor skill appears to be task dependent and is influenced by the training interval. There are a few reasons why distributive practice is preferable to a massed training schedule, especially within the realm of surgical skill development. First, incorporating breaks in practice allows for mental rehearsal in anticipation of the next scheduled skills session, which has been shown to foster the formation of the neurologic changes in the brain that accompany motor skill development (Hall, 2002). Second, there is considerable mental and physical fatigue associated with skills training. This is especially true during the early phase of skill acquisition, when the high mental demands of learning the task can induce fatigue and interfere with the cognitive aspect of learning a new task (Tsuda, Scott, Doyle, & Jones, 2009). Fatigue from sleep deprivation has also been shown to significantly impair psychomotor and cognitive skill development in surgical residents (Kahol et al., 2008). Perhaps even more significant, Branscheidt et al. (2019) have shown that the negative effects of fatigue on motor skill acquisition can extend even to subsequent practice days in the absence of fatigue.
Lastly, the benefits of longer intersession training intervals are consistent with the hypothesis of memory consolidation theory. Originally proposed by Müller and Pilzecker (1900) over a century ago, this theory suggests that learning does not create instantaneous, permanent memories. Rather, our memories remain in a state that is vulnerable to disruption immediately after learning a new task, and they take time to become fixed (or consolidated) (Lechner et al., 1999). Periods of inactivity, rest, or sleep after a practice session have also been found to play a significant role in the consolidation of long-term memory (Shea, Lai, Black, & Park, 2000). Perhaps even more importantly, this process is not only time dependent (Brashers-Krug, Shadmehr, & Bizzi, 1996; McGaugh, 2000) but also sleep dependent, in that a delayed stage of learning occurs without practice as a result of post-training consolidation. In essence, the literature supports the notion that longer sleep durations yield greater improvements in retention, particularly for procedural memories (Karni, Tanne, Rubenstein, Askenasy, & Sagi, 1994; Stickgold, James, & Hobson, 2000; Stickgold, Whidbee, Schirmer, Patel, & Hobson, 2000; Walker et al., 2003). Karni et al. (1994) and Stickgold, Whidbee, et al. (2000) further showed that this process was unable to take place during an interruption in rapid eye movement (REM) sleep, which ultimately prevents overnight performance gains. While this concept of distributive practice is an important component of skill acquisition, other factors are also necessary to improve one's skill.

Deliberate Practice
Caffarella defines transfer of learning as "the effective application by program participants of what they learned as a result of attending an education or training program" (Merriam, Caffarella, & Baumgartner, 2007, p. 211). For surgical residency training programs, successful transfer of learning is a necessary requirement if we are to provide surgical residents with the knowledge needed to perform a technical skill. While distributive practice can ultimately improve these learned skills through spaced trials and memory consolidation, practice and repetition are not the only requirements for achieving expertise in a particular skill. This has been shown in the surgical literature, where practice and repetition alone were not enough for novices to reach competency in arthroscopy and laparoscopy, despite repeated, sustained practice attempts (Alvand, Auplish, Gill, & Rees, 2011; Grantcharov & Funch-Jensen, 2009). These findings are consistent with newer research on skill acquisition suggesting that another element is responsible for performance improvement.
Contrary to Galton's theory of inherent physical talent, psychologist K. Anders Ericsson suggests that experts are not born, but rather made, through a process called deliberate practice. Research on athletes, chess players, and musicians has shown that elite performers are almost always introduced to their skill at an early age, reach peak career performance in their mid-to-late 20s, and need at least ten years of intense performance before they can reach the expert, international level (Bloom & Sosniak, 1985; Chase & Simon, 1973; Ericsson, 2004). Expert performance in a domain is a result of practice and one's deliberate engagement in, and choice of, activities that improve and maintain high performance (Ericsson et al., 1993). While this research has become popularized in today's culture as the 10,000-hour practice rule (Gladwell, 2008), this does not fully explain deliberate practice. Ericsson's original (1993) finding was that consistent, gradual improvement in performance was met only under the following conditions: participants were instructed to improve some aspect of their performance on a well-defined task; participants were motivated to improve; participants received detailed, immediate feedback on their performance; and participants were provided ample opportunities to improve their performance gradually through repeated practice. Only when these conditions are met can the term "deliberate" be used to define practice.
With regard to surgery specifically, while regular practice of a skill has been shown to be an important determinant of patient outcomes (Halm, Lee, & Chassin, 2002), it does not fully account for skill level in surgeons. Using simulation training, research by Palter and Grantcharov (2014), as well as Hashimoto et al. (2015), has shown that current surgical training could be improved and that residents can and do reach a higher level of expertise in a skill through the implementation of deliberate practice. This concept has significant implications for surgical residency training, where, each day, novice residents are taught new technical skills that require concentration and repetition (distributive practice) in an effort to reach an acceptable level of everyday performance.
According to Ericsson, Nandagopal, and Roring (2009), this process typically takes about 50 hours, after which performance becomes automated and individuals no longer seek to modify or improve their behavior, leading to a stable performance plateau. This emphasizes that if one wishes to attain a mastery or expert level of performance in a particular domain, deliberate practice is critical to the process. Before discussing this further, however, it is important to define the levels of skill acquisition that a novice learner may progress through on the way to achieving an expert level of performance.

Dreyfus and Dreyfus Model of Skill Acquisition
Based on their study of chess players, air force pilots, and army tank drivers and commanders, brothers Stuart and Hubert Dreyfus developed their model of skill acquisition (Dreyfus & Dreyfus, 1979). This developmental model, based on situated performance and experiential learning, posits that students pass through five levels of skill acquisition through formal education and practice. These levels include: novice, advanced beginner, competent, proficient, and expert (Dreyfus, 2004, Appendix C). According to the Dreyfus brothers, this theoretical model represents a progression from the analytic behavior of a detached subject, consciously decomposing his environment into recognizable elements and following abstract rules, to involved skilled behavior based on an accumulation of concrete experiences and the unconscious recognition of new situations as similar to whole remembered ones (Dreyfus & Dreyfus, 1986, p. 35).
Since its initial publication, the Dreyfus model has been adapted for use in virtually all disciplines, but has had substantial adaptation within the disciplines of education (Berliner, 2004), dentistry (Lyon, 2015), nursing (Benner, 2001), and medicine (Carraccio, Benson, Nixon, & Derstine, 2008). This model has particular significance to technical skill acquisition in the surgical discipline as, historically, competency in technical skills has simply been assumed after protracted exposure to procedures over the course of five years of training. The problem with this method, however, is that procedural experience in current surgical training is varied, unsystematic and unstructured (Lodge & Grantcharov, 2011). Furthermore, according to Szasz et al. (2015), there is a lack of agreement in what defines "competency," and this term is frequently used interchangeably with "proficiency" within the surgical discipline.
Merriam-Webster defines competence as "the quality or state of having sufficient knowledge, judgment, skill, or strength (as for a particular duty or in a particular respect)," while proficiency is defined as "advancement in knowledge or skill" (Merriam-Webster Dictionary, 2018). Incorporating the following stages of the Dreyfus model into practice can provide surgical educators with a more critical understanding of the differences between these terms and the specific stages of skill acquisition that surgical residents may progress through in residency training.
Novice. This initial stage of skill acquisition is characterized by the recognition of discrete facts and features, which are clearly and objectively defined, and remain virtually context-free (Dreyfus & Dreyfus, 1986). A novice learner treats each situation as a new one and looks for appropriate rules to follow. For this level of skill, Dreyfus notes that it is important to deconstruct the task that is to be taught, as "the student needs not only the facts but also an understanding of the context in which that information makes sense" (Dreyfus, 2004, p. 177).
Advanced Beginner. Through significant practical experience, a novice learner can progress to the advanced beginner stage of skill acquisition. While these learners still demonstrate only marginally acceptable performance, they are able to perceive similarities across concrete situations and can relate them to previous examples of the same experience (Benner, 2001). According to Dreyfus and Dreyfus (1986), practical experience is much more productive than any amount of verbal description at this stage. This learner, like a novice, can take in little of the situational aspects of performance, as they continue to "concentrate on remembering the rules they have been taught" (Benner, 1982, p. 404).
Competent. The competent learner is able to draw on past situational experience to create a problem-solving format that allows them to adopt a hierarchical process of decision-making (Dreyfus & Dreyfus, 1986). More specifically, a "competent performer seeks rules and reasoning procedures to decide which plan or perspective to adopt," in order to prevent mistakes (Dreyfus, 2004, p. 178). In this respect, while they may lack the efficiency of a more experienced performer, they are developing the ability to manage situational contingencies (Benner, 2001). It is important to note that at this stage, unlike the novice or advanced beginner, who act according to strict rules, the competent performer becomes more invested in the outcomes that result from their actions (Dreyfus & Dreyfus, 1986).

Proficient. A learner at the proficient stage of performance relies less on rules, is more emotionally involved in their tasks, and has the ability to discriminate among a variety of situations in order to choose the most effective action to accomplish their task (Dreyfus, 2004). More specifically, a proficient performer, "while intuitively organizing and understanding his task, will find himself thinking analytically about what to do" (Dreyfus & Dreyfus, 1986, p. 29). In essence, the situation is what ultimately guides the proficient performer's response (Benner, 2004).
Expert. This final stage of acquisition is marked by full engagement and efficiency on the part of the performer. According to Dreyfus, an expert "not only sees what needs to be achieved…he or she also sees immediately how to achieve this goal" (2004, pp. 179-180). A curious distinction here is that responses at this stage of performance are reactive rather than studied and premeditated. Dreyfus and Dreyfus (1986) note that "when things are proceeding normally, experts don't solve problems and don't make decisions; they do what normally works" (p. 31). This five-stage model will be instrumental in guiding this particular research project as I employ a variety of principles of adult education to assist in the technical skill acquisition of junior surgical residents participating in a vascular anastomosis simulation lab. While this model has been widely adopted, it is not without its criticisms. Hargreaves and Lane (2001) have criticized the fact that a linear model of skill acquisition is unable to sufficiently explain the everyday experiences of learning.
Perhaps even more pertinent, as it relates to social constructivism, the model has been criticized for its apparent lack of social structure or social knowledge (Purkis, 1994; Rudge, 1992). This point needs further exploration, as the Dreyfus model does not seem to address how to encourage or assist learners through the stages of skill acquisition. The role of reflection is also minimized in this five-stage model and needs to be accounted for as it pertains to this research. Addressing these criticisms and concepts is crucial to assisting technical skill acquisition. Therefore, the experience that ultimately shapes learning needs to be reviewed, and the role of a more experienced other, also referred to as a coach, must be defined in relation to this construct. To help bridge these constructs, a brief discussion of experiential learning theory (Kolb, 2014) is required. Dewey (1938) was one of the first theorists to recognize the role experience plays in education, stating that "all genuine education comes about through experience" (p. 13).

Experiential Learning
Heavily influenced by Dewey and by the earlier work of Lewin (1935) and Piaget (1980), David Kolb further expanded on this concept, defining learning as "the process whereby knowledge is created through the transformation of experience" (Kolb & Kolb, 2005, p. 194). Adult educator Malcolm Knowles originally defined andragogy as "the art and science of helping adults learn" (Knowles, 1980, p. 43), and Kolb's Experiential Learning Theory (ELT) is now considered by many to be a fundamental theory in andragogy.
Emphasizing "the central role that experience plays in the learning process" (Kolb, Boyatzis, & Mainemelis, 2001, p. 2), the theory itself is built upon the following six propositions: learning is best conceived as a process, not in terms of outcomes; all learning is relearning; learning requires the resolution of conflicts between dialectically opposed modes of adaptation to the world; learning is a holistic process; learning results from synergetic transactions between the person and the environment; and learning is the process of creating knowledge (Kolb & Kolb, 2005). Kolb (2014) further conceptualized that in order to learn from an experience, four different types of abilities are required: 1) a willingness to involve oneself in a new experience (concrete experience); 2) observation and reflection in order to view the experience from a variety of perspectives (reflective observation); 3) the ability to analyze one's actions so new ideas and concepts can be created from one's observations (abstract conceptualization); and 4) the ability to problem-solve and make decisions so new ideas and concepts can be used in actual practice (active experimentation).
According to Kolb, these adaptive learning abilities combine to form two dialectically related modes of grasping experience (concrete experience and abstract conceptualization) and two dialectically related modes of transforming experience (reflective observation and active experimentation), and it is up to the learner to "continually choose which set of learning abilities he or she will use in a specific learning situation" (Kolb, Boyatzis, & Mainemelis, 2001, p. 3). ELT is typically depicted as an idealized learning cycle, and students are strongly encouraged to use all four components in order to enhance their learning (Appendix D). Important in this learning cycle is the assumption that the more often a task is reflected upon, the greater the opportunity for the learner to modify and refine their efforts.
This research also led Kolb to propose a Learning Style Inventory (LSI), which identifies four different learning styles associated with varying approaches to learning: Diverging learners, whose dominant learning abilities are Concrete Experience (CE) and Reflective Observation (RO); Assimilating learners, whose dominant learning abilities are Abstract Conceptualization (AC) and Reflective Observation (RO); Converging learners, whose dominant learning abilities are Abstract Conceptualization (AC) and Active Experimentation (AE); and Accommodating learners, whose dominant learning abilities are Concrete Experience (CE) and Active Experimentation (AE) (Kolb et al., 2001).
To put these learning styles into further perspective, Divergers are usually open-minded, imaginative, and prefer to work in groups; Assimilators prefer reading, lectures, and analytical models; Convergers are apt to experiment with practical applications of new ideas; and Accommodators tend to prefer hands-on learning experiences (Kaushik, 2017). As a point of reference for this research, the LSI has been examined considerably in both the medical and surgical literature as a way to identify the types of learners that make up these residency programs, in an effort to make training more efficient and effective. Adesunloye, Aladesanmi, Henriques-Forsythe, and Ivonye (2008) found that medical residents and faculty have a predominantly assimilating learning style, while research on surgical residents and faculty has found a predominantly converging learning style (Engels & de Gara, 2010; Mammen et al., 2007). Understanding the specific learning styles of our trainees will allow educators, or a more experienced other, to develop specific curricula and programs that may better assist a learner in achieving their full potential when learning a new skill. This task can best be conceptualized through formalized mentoring and coaching.

Mentoring and Coaching
Much like the criticism the Dreyfus model of skill acquisition receives with regards to reflection (Peña, 2010), Kolb's ELT (2005), in its original form, has also been criticized for minimizing the role reflection plays in the learning process. Boud et al. (1985) note that while Kolb's model "has been useful in assisting us in planning learning activities and in helping us to check simply that learners can be effectively engaged," they comment, "it does not help… to uncover the elements of reflection itself" (p. 13).
Philosopher Donald Schön (1983) describes the process of reflecting-on-practice as a way for practitioners to "think back on a project they have undertaken, a situation they have lived through, and they explore the understandings they have brought to the handling of the case" (p. 61). This is an incredibly important aspect of learning a new task, as it allows one to think about their own progress and determine the ways they may change their practice through experiential learning next time the task is encountered. The question now becomes, is it possible to help transform a learner's experience through this reflective process, and assist them with attaining higher levels of skill over time?
Acknowledging this question, Peno and Silva Mangiante (2012) created the Purposeful Ongoing Mentoring Model (POMM) to help learners move through the stages of skill acquisition through mentoring, scaffolding, and reflection. The authors adapt a definition put forth by Strowd and Reynolds (2013), who characterize a mentor as "a voluntary and active participant in the personal and professional development of the mentee, offering knowledge, experience, guidance, support, and opportunity for advancement" (p. 244). In this model, the mentor's relationship and actions towards the learner are guided by the work of Vygotsky (1978) and his concept of scaffolding. Vygotsky described learning as an active process in which learners construct their own understanding and knowledge of the world through experiencing things and reflecting on those experiences. As part of this description of learning, Vygotsky described a zone of proximal development (ZPD), or "the distance between the actual developmental level as determined by independent problem solving and the level of potential development as determined through problem solving under adult guidance, or in collaboration with more capable peers" (Vygotsky, 1978, p. 86).
By identifying a learner's ZPD and employing the use of scaffolding, the mentor, or more capable peer, "provides a model for a higher level of practice through demonstration and/or explanation (challenge) while supporting (coaching/feedback) the learner's attempts to make sense of and emulate what is being taught" (Peno & Silva Mangiante, 2012). The model also draws on the work of Schön (2017) to promote the art of reflection in the mentee, in an effort to "illuminate even subtle differences that need alteration" (p. 7) and improve practice.
According to the authors, the POMM model is "meant to supply a frame for purposefully thinking about, preparing for, and developing actions for the acquisition of higher levels of skill in any area of work" (Peno & Silva Mangiante, 2012, p. 13). While the POMM allows us to better understand how the relationship between a learner and a more experienced other can facilitate the reflective process and promote their movement through the Dreyfus model, I must differentiate between a mentor and a coach, as it pertains specifically to this research.
Coaching author Timothy Gallwey defines coaching as "the art of creating an environment, through conversation and a way of being, that facilitates the process by which a person can move toward desired goals in a fulfilling manner" (Gallwey, 2002, p. 177). These goals can further be defined as "a cognitive image of an ideal stored in memory for comparison to an actual state; a representation of the future that influences the present; a desire (pleasure and satisfaction are expected from goal success); a source of motivation, an incentive to action" (Cochran & Tesser, as cited in Street, 2002, p. 100).
According to Grant (2001), the difference between a mentor and a coach is usually based on the objective. In coaching, the objective is skill development and performance enhancement, while in mentoring, the objective is typically long-term career development. Grant also notes that while "mentoring traditionally involves an individual with expert knowledge in a specific domain passing on this knowledge to an individual with less expertise," the coach does not necessarily need to be an expert in the trainee's area of learning, needing to "only have expertise in facilitating learning and performance enhancement" (p. 30).
Further distinctions between coaching and mentoring have been adapted from Passmore (2007) and can be found in Appendix E. For the purposes of this research, as the relationship between the facilitator and the learners took place over the course of five weeks, with a specific focus on technical skill improvement, I will refer to this relationship as "coaching," to stay aligned with the definitions provided.
Aside from this change in terminology, the goal of this research remains the same. In an effort to improve performance, I will employ the use of coaching, scaffolding, and reflection on practice, as described in the POMM, to assist our learners in achieving a higher level of skill acquisition. Having described this concept, I now turn attention to the currently available literature related to video analysis and coaching in the surgical setting.

Video Analysis and Coaching in the Clinical Setting
In the clinical setting, feedback has been defined as "specific information about the comparison between a trainee's performance and a standard, given with the intent to improve the trainee's performance" (Van De Ridder et al., 2008). Soon after video cameras became widely available, video replay for analysis became established as an effective method for performance review, providing feedback, and contributing to self-assessment in both team and individual sports (Lounsbery & Sharpe, 1996; Winfrey & Weeks, 1993). Early adoption of video analysis, combined with some type of verbal feedback, in surgical training programs, however, failed to replicate these perceived benefits. Working with surgical residents in an orthopedic simulation lab, Backstein, Agnidis, Regehr, and Reznick (2004) were unable to demonstrate any significant improvement in technical skills when video feedback analysis was incorporated into their model. To assess whether this lack of improvement was due to participants receiving only a single exposure to video feedback analysis, the main authors performed a second study using a vascular anastomosis bench model. Even when surgical residents were exposed to repeated video feedback analysis over the course of three weeks, the authors again failed to show any improvement in technical skill (Backstein, Agnidis, Sadhu, & MacRae, 2005).
More recent research, however, has begun to show promising results when video feedback analysis is incorporated into training programs. During a laparoscopic suturing workshop, for instance, Jamshidi, LaMasters, Eisenberg, Duh, and Curet (2009) found that the intervention group, who reviewed their video with a senior faculty member prior to their next attempt, performed at a significantly higher level than the control group. This particular research highlights an important concept that may be responsible for augmenting video analysis: the specific type of feedback and analysis provided.
Nesbitt, Phillips, Searle, and Stansby (2015) evaluated the effect of individualized video feedback with expert analysis, unsupervised video feedback, and standard lecture format in 35 medical students during a suturing workshop. While they found a significant difference using video feedback over standard lecture format, they ultimately concluded that students could attain similar levels of technical skill acquisition using either unsupervised video feedback or individualized video feedback with expert analysis. More recently, Phillips et al. (2017) evaluated the effectiveness of unsupervised video feedback versus direct expert feedback during a skills lab on intravenous cannulation, catheterization, and suturing. While they also found utility in video-assisted feedback, they failed to show a significant difference using direct expert feedback. These particular studies, however, also failed to ensure the feedback provided by faculty remained consistent across individuals and groups. Because of these inconsistencies, the use of surgical coaching, combined with video analysis, may be more valuable for helping surgical trainees improve technical skills, as it provides faculty with a standardized format to analyze and critique individual performance.
In a randomized controlled study evaluating the effect of video-based coaching on laparoscopic surgical skills, Singh, Aggarwal, Tahir, Pucher, and Darzi (2015) found that students exposed to video-based coaching demonstrated enhanced laparoscopic surgical performance compared to those who only participated in online tutorials and practice sessions. In a similar study, comprehensive surgical coaching, including video feedback analysis, was evaluated in surgical residents performing minimally invasive gastric bypass surgery (Bonrath et al., 2015). Over the course of their eight-week rotation, the authors concluded that residents exposed to comprehensive, video-based coaching demonstrated improved technical skill, significant error reduction, and improved self-assessment compared with conventional training. More recently, a randomized controlled trial assessing surgical residents' ability to perform a side-to-side intestinal anastomosis on cadaveric dog bowel also showed a significant improvement in technical skills in the experimental group exposed to video-based coaching (Soucisse et al., 2017). Based on these studies, it appears that when structured coaching was instituted in addition to video playback analysis, improvements in technical skills were observed compared to control groups.

Self-Assessment
Self-assessment has been defined as "a process of personal reflection based on an unguided review of practice and experience for the purposes of making judgments regarding one's own current level of knowledge, skills, and understanding as a prequel to self-directed learning activities that will improve overall performance and thereby maintain competence" (Eva & Regehr, 2007, p. s81). The importance of self-assessment has been vigorously researched over the last thirty years in medicine and has been identified as a crucial aspect of professional self-regulation (Arnold, Willoughby, & Calkins, 1985; Boud, 1999; Gordon, 1991). As noted by Eva and Regehr (2005), however, most of these studies have cast doubt on an individual's ability to adequately perform self-assessment, with most concluding that it is, overall, quite poor.
One potential explanation points to the methodological weaknesses of these studies and their ability to adequately evaluate self-assessment (Ward, Gruppen, & Regehr, 2002). Research does show, however, that the potential of self-assessment is twofold: it can function as an identifier of one's weaknesses in a task and can also help identify one's strengths (Ross, 2006). Bandura's social cognition theory also recognizes that students who perceive themselves to be successful on a task are more likely to believe they will be successful in the future (Bandura, 1997). This is a powerful concept with regards to learning a new task.
It has long been shown, however, that learners tend to be poor evaluators of their own performance. Kruger & Dunning (1999) performed research on psychology students and found that low performers are unconsciously incompetent and tend to overestimate their ability, while high performers underestimate their ability. These results were also replicated in medical students (Bryan, Krych, Carmichael, Viggiano, & Pawlina, 2005) and medical residents (Parker, Alford, & Passmore, 2004). When Hu, Tiemann, & Brunt (2013) evaluated medical students and surgical interns performing suturing and knot tying tasks, they found that novice trainees over-estimated their basic technical skills when compared to their assessment by a senior surgeon. With regards to surgical residents specifically, Herrera-Almario et al. (2016) evaluated the self-assessment scores of third year general surgery residents using video playback analysis of their laparoscopic skills. The authors found that residents consistently scored themselves lower than faculty scoring both before and after video analysis of their performance.
What is important to point out, especially in terms of how it relates to this research, is that methods to improve self-assessment in learners have been studied quite extensively. In a systematic review by Colthart et al. (2008), the authors found that self-assessment can be enhanced by feedback, particularly video feedback and verbal feedback, and by providing specific criteria with regards to assessment and benchmarking guidance. This finding that video feedback improves self-assessment is not a new concept and, in fact, was studied specifically by Martin, Regehr, Hodges, and McNaughton (1998). Using videotaped benchmarks of interviewing skills with family medicine residents, the authors were able to show a significant improvement in the residents' self-assessment ability after viewing the video of the benchmark and then their own performance. This was also replicated with medical students performing a suturing task in a simulation environment. When Hawkins, Osborne, Schofield, Pournaras, and Chester (2012) evaluated thirty-one medical students on this task, they found no difference in self-assessment scores after students watched their own video playback alone; scores differed significantly, however, when video feedback was combined with a video benchmark performance. It appears that, when compared against a benchmark, video analysis allows individuals to identify their own strengths and weaknesses in the context of good professional practice.

Chapter Two Summary
This chapter provided an in-depth discussion of the theoretical frameworks and concepts that guided this research project. A brief historical overview of the limitations of human professional ability and the factors that contribute to performance excellence was discussed. In order to provide a basis for understanding human psychomotor development, two major theories of motor learning were reviewed: Jack Adams's closed-loop model of motor learning and Richard Schmidt's schema theory of motor learning. Both of these theories attempt to explain skilled motor behavior acquisition in humans, in an attempt to move beyond the simple stimulus-response theories that predominated during the early twentieth century. While Adams's theory introduced the concept of refining a motor act through "knowledge of results," Schmidt's theory further described the difference between acting on "knowledge of results" and "knowledge of performance," in relation to what he termed the generalized motor program (GMP).
Moving on from motor theory, the chapter described the concepts of distributed and deliberate practice as they relate not only to learning a new skill but to becoming an expert performer in a particular domain. This discussion introduced the 10,000-hour practice effect, but further delineated that achieving gradual, consistent improvement in performance requires a combination of motivation, feedback, and repeated practice. In order to identify mastery in performance, however, it was suggested that the Dreyfus model of skill acquisition be introduced to help guide instruction. This developmental model is based on situated performance and identifies five levels of skill: Novice, Advanced Beginner, Competent, Proficient, and Expert.
In order to explore the process of learning through one's experience, the major tenets of Kolb's Experiential Learning Theory (ELT) were explored, including an overview of Kolb's Learning Style Inventory (LSI), which can be helpful in identifying the particular learning style of a student. Noting some of the criticisms of ELT, particularly its lack of emphasis on reflection, the definitions of coaching and mentoring were then explored as a method to help support reflection on learning. Tying these concepts together, the Purposeful Ongoing Mentoring Model (POMM) was introduced as a way to assist learners in improving performance with the help of a more knowledgeable other and reflection. A review of the literature on the use of video analysis and structured coaching methods was then discussed in relation to their current usage in surgical residency training programs. The chapter concluded with an overview of the definition of self-assessment and the methodologies that can enhance a learner's ability to accurately assess their performance in an effort to improve practice. The attention will now turn towards a detailed description of the methodology that formed the basis of this research.

Chapter Three: Research Design and Methodology
This study was developed to compare self-assessment video analysis against structured, individualized, video-based coaching as a way to improve technical skill acquisition in junior surgical residents participating in a vascular anastomosis simulation workshop. Based on this concept, the following research questions were used to help guide this research:

Research Design
In order to adequately address the research questions, this study employed a quantitative approach using a between-groups experimental research design. Quantitative research is defined by Creswell as an "approach for testing objective theories by examining the relationship among variables. These variables, in turn, can be measured, typically on instruments, so that numbered data can be analyzed using statistical procedures" (2014, p. 4). Quantitative designs are most closely aligned with a postpositivist worldview and adhere to the scientific method, which stipulates that "a researcher begins with a theory, collects data that either supports or refutes the theory, and then makes necessary revisions and conducts additional tests" (Creswell, 2014, p. 7).
A quantitative methodology was chosen for this research due to the specific nature of the research questions. According to Creswell (2014), "if the problem calls for (a) the identification of factors that influence an outcome, (b) the utility of an intervention, or (c) understanding the best predictors of outcomes, then a quantitative approach is best" (p. 20). This randomized controlled research project was implemented using a vascular anastomosis bench model simulation workshop in a large, academic, Level 1 trauma center in the Northeastern United States. Bench models and simulation training have continued to gain traction over the last decade as a means of improving residents' technical skills. The current literature has shown a positive correlation between bench model training and technical competency in the operating room (Wilasrusmee, Lertsithichai, et al., 2007), and bench model training has also been associated with improved patient outcomes.

Participants
As part of surgical training, the ACGME requires residents to have access to, and participate in, simulation labs throughout training to "address acquisition and maintenance of skills with a competency-based method of evaluation" (Accreditation Council for Graduate Medical Education, 2019, p. 7). In an effort to help our surgical residents acquire the technical skills necessary to perform a vascular anastomosis, our local residency curriculum requires residents to attend our vascular anastomosis simulation workshop at the beginning of their second year of training, which runs annually from July through August. Because these residents were all required to participate, this study employed convenience sampling by soliciting the entire class of second-year surgical residents scheduled to attend this vascular anastomosis lab (total N = 13) for study participation starting in July 2019. As this is a convenience sample, and not a random sample, the resident participants cannot be considered representative of any population (Fraenkel, Wallen, & Hyun, 2011). It is important to note, however, that this research project was conducted using a randomized controlled design. According to Fraenkel, Wallen, and Hyun (2011), "the essential ingredient of a true experimental design is the subjects are randomly assigned to treatment groups" (p. 270). All residents enrolled in this research were randomly assigned, using a random number generator, to either the structured, individualized, video-based coaching group (Experimental) or the self-assessment video analysis group (Control). To minimize any differences between resident participants from the three departments represented (general surgery, urology, and plastic surgery), the number generator was used three times to assign each department's residents into either the control or experimental group.
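The department-stratified assignment described above can be sketched in code. This is a minimal illustration only: the roster names, per-department sizes, and seed below are hypothetical placeholders rather than the actual participant data, and the study used its own random number generator, not this script.

```python
import random

def assign_groups(residents_by_dept, seed=None):
    """Randomly split each department's residents between the Control
    (self-assessment) and Experimental (video-based coaching) groups,
    stratifying by department as described in the text."""
    rng = random.Random(seed)
    assignment = {}
    for dept, residents in residents_by_dept.items():
        shuffled = list(residents)
        rng.shuffle(shuffled)  # one independent random draw per department
        half = len(shuffled) // 2
        for resident in shuffled[:half]:
            assignment[resident] = "Control"
        for resident in shuffled[half:]:
            assignment[resident] = "Experimental"
    return assignment

# Hypothetical roster of N = 13 residents across the three departments
roster = {
    "general surgery": ["R01", "R02", "R03", "R04", "R05", "R06", "R07"],
    "urology": ["R08", "R09", "R10"],
    "plastic surgery": ["R11", "R12", "R13"],
}
groups = assign_groups(roster, seed=2019)
```

In this sketch, a department with an odd number of residents places the extra resident in the experimental group; how the actual study resolved odd splits is not specified above.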
The facilitator responsible for running the anastomosis lab and providing the coaching sessions, along with the two expert evaluators, are all local, board-certified faculty from the division of vascular surgery, and all have been practicing for a minimum of five years. See the table below for participant attributes:

Variables
Independent variables are defined as those "that (probably) cause, influence, or affect outcomes" (Creswell, 2014, p. 52), while dependent variables are "those that depend on the independent variables; they are the outcomes or results of the influence of the independent variables" (p. 52). With regards to the research design of this study, the independent variable has two levels: self-assessment video analysis and structured, individualized, video-based coaching. The dependent variable is technical skill acquisition, measured by performance scores using the validated MOSAT and GRS evaluation tools, in addition to total time and leak rate, obtained at four separate recorded trials that occurred over seven weeks during the 2019 vascular anastomosis workshop.

Instruments
Technical performance on the vascular anastomosis was assessed using the Mini Objective Structured Assessment of Technical Skills (MOSAT) checklist (Appendix F) combined with the Global Rating Scale (GRS, Appendix G). The MOSAT is a detailed checklist consisting of 24 operation-specific actions necessary to perform the vascular anastomosis that participants were required to produce for this simulation lab. These 24 actions are all graded on a binary scale, where the performer receives either zero or one point depending on whether they successfully performed each action. For purposes of this research, the total possible MOSAT score was cut down from 24 to 22 by eliminating the two variables related to "Control of vessel." This was due to the variation in materials provided to participants for this particular anastomosis lab, specifically the Ethicon suturing jig, which already has built-in clamps to secure the graft. The second part of the scoring evaluation, the GRS, consists of eight specific dimensions, each related to some aspect of operative performance that is also pertinent to this simulation lab exercise.
Each dimension of performance is graded on a 5-point Likert scale with "1" being the lowest possible score and "5" being the highest possible score per category. The highest possible score a participant may receive on the GRS, based on this scoring, is 40.
As this research design is incorporating an already existing instrument, it is important to document the instrument background, as well as its validity and reliability.
As defined by Fraenkel et al. (2011), validity "refers to the degree to which evidence supports any inferences a researcher makes based on the data he or she collects using a particular instrument" (p. 148). These authors also define reliability as "the consistency of the scores obtained-how consistent they are for each individual from one administration of an instrument to another and from one set of items to another" (p. 154).
The MOSAT checklist and GRS have previously been reported to have inter-rater reliability coefficients of 0.781 and 0.843, respectively (Reznick, Regehr, MacRae, Martin, & McCulloch, 1997). The combined use of rating scales such as the MOSAT and GRS has been implemented in most surgical disciplines over the last two decades, and has recently been deemed the best overall approach to assess technical competence in surgical trainees (Szasz et al., 2015). Based on the validity and reliability of these instruments, their applicability to the required task, and their ability to specifically address my research questions, the MOSAT and GRS were ultimately selected for use in this research.
In addition to this scoring, all residents were evaluated on their time to perform each vascular anastomosis, and all anastomoses were tested for leakage using a low-fidelity model. Total time to perform the anastomosis was calculated from the time the first stitch was placed until the final knot had been tied and cut by the resident. As all of the trials were video recorded, this total time was calculated and documented from the video session for accuracy. All anastomoses were also tested for leakage after each recording session by clamping the distal ends of the graft to isolate the anastomosis and then perfusing it with normal saline solution for a total of 30 seconds. This low-fidelity model is similar to the one our institution reported in 2013 (Okhah, Morrissey, Harrington, Cioffi, & Charpentier, 2013). The saline bag was suspended on a pole at a height of 72 inches from the ground. An intravenous line (10 drops/ml, 104 in, Baxter) was used to connect the saline to the clamped graft. This clamped graft was then set in a 1000 ml graduated cylinder resting on a table approximately 30 inches off the ground. A stopwatch was used to time the 30 seconds allotted for saline perfusion. Any leaked saline captured in the 1000 ml cylinder during the 30-second perfusion was then transferred to a smaller 100 ml graduated cylinder. This provided an accurate assessment of total leak (milliliters), which was then documented on the data collection sheet according to the unique resident identification number.
In order to better quantify a resident's technical performance and attempt to identify participants who achieved baseline competency in an end-to-side anastomosis, a scoring system was created by the principal investigator (PI) based upon the variables of MOSAT, GRS, time, and leak rate. These variables were combined into a single formula that placed greater weight on the more significant variables (MOSAT and GRS) and less weight on time and leak rate. This score was developed based on the functional skill of novices entering this anastomosis lab, and the categorical weight applied to the variables attempts to best represent this. Both the MOSAT and GRS were given a standard multiplier of 2 due to their validity and applicability to the task. The MOSAT, being a technical checklist of the specific actions necessary to perform the anastomosis, was given an additional multiplier of 2. Points were then deducted for total time to complete the anastomosis as well as for leak rate. The complete formula can thus be expressed by the following equation:

Combined Technical Score = 2 x (MOSAT x 2) + (GRS x 2) - Time - Leak Rate
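As a worked illustration of this equation, the following sketch computes the combined technical score for one hypothetical trial. The input values are invented for illustration only; leak rate is in milliliters per the leak test described above, while the scale of the time penalty is an assumption, as its units are not restated here.

```python
def combined_technical_score(mosat, grs, time_penalty, leak_ml):
    """Combined Technical Score = 2 x (MOSAT x 2) + (GRS x 2) - Time - Leak Rate.

    The MOSAT checklist (max 22) carries an effective 4x weight (the
    standard 2x multiplier plus an additional 2x for its checklist
    specificity), the GRS (max 40) carries a 2x weight, and total time
    and leak rate are subtracted as penalties."""
    return 2 * (mosat * 2) + (grs * 2) - time_penalty - leak_ml

# Hypothetical trial: MOSAT 18/22, GRS 30/40, time penalty 25, 40 ml leak
score = combined_technical_score(18, 30, 25, 40)
# 2*(18*2) + (30*2) - 25 - 40 = 72 + 60 - 65 = 67
```

Under this weighting, a flawless performance (MOSAT 22, GRS 40, no time or leak penalty) yields a maximum of 168 points, which makes the dominance of the MOSAT term easy to see.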

Procedures
Residents were randomly assigned into one of two groups, a self-assessment video analysis only group (control) and a structured, individualized, video-based coaching group (experimental). Upon presenting to the lab on week one, all residents were instructed on the goals and objectives of the vascular anastomosis lab. In addition, residents were made aware of the intent to use this lab as part of a research project to attempt to determine the effects of Coaching and Self-Assessment on performance improvement and skill acquisition. All residents were then offered a copy of the informational waiver of consent to read and maintain for their records (Appendix H).
Participants were then asked to perform an end-to-side anastomosis of an 8-mm polytetrafluoroethylene (PTFE) graft (LeMaitre Vascular, Inc., Burlington, MA). This anastomosis was to be performed using a continuous running suture technique with a 6-0 double-armed polypropylene suture on a C-1 needle (Ethicon). Each participant then moved to their individual station, with a partner, where a video recording was made of them performing this anastomosis on the PTFE graft.
Residents always worked in groups of two, so that as one resident performed the anastomosis, the other resident, sitting across from them, could serve as their assistant.
Each station contained an Ethicon mounting jig, sterile gloves (for deidentification), a 6-0 double-armed polypropylene suture with a C-1 needle, and two segments of an 8mm PTFE graft (8cm and 6cm in length). A variable array of instruments was provided, and residents had the ability to choose which instruments they thought were needed to correctly perform this anastomosis (Figure 1). It is important to note that the grafts were cut to the specified lengths by the PI, but no graftotomy was created, and grafts were not spatulated prior to the procedure. Graftotomy and spatulation were left to the discretion of the resident, and this portion of the procedure was not included in the total time for the anastomosis. Immediately following the pre-test in week one, all participants received a link to a private, online, video streaming platform of the vascular faculty member discussing the various types of instruments required to perform the vascular anastomosis, the proper suturing technique that is required, and some helpful hints for anastomosis success. Two videos were ultimately made to address these concepts and they can both be viewed here: Video 1, Video 2. There were no restrictions on the number of times these videos could be reviewed by participants.
Residents returned each of the following six weeks to practice and perform this same anastomosis. Each resident was video recorded a total of four times over the course of the seven weeks; week one (pre-test), then again in week three, week five, with a final (post-test) recording occurring in week seven. Weeks two, four, and six were standard practice sessions and residents were not video recorded at these sessions. All residents received generalized verbal feedback during their anastomosis sessions as per the current standard vascular anastomosis simulation lab protocol (one vascular faculty member for all residents). There were no restrictions on the types of questions regarding performance residents could ask the vascular attending facilitator during these sessions.
Residents in the structured, individualized, video-based coaching group received, in addition to this conventional simulation lab, video-based coaching sessions by the vascular attending, structured around a specific coaching model. These coaching sessions occurred during the non-video weeks (two, four, and six), prior to the practice sessions, for a total of three coaching sessions per resident in the experimental group.
For the purposes of this research design, the GROW model of coaching (Whitmore, 2010) was selected due to its simple design structure and targeted focus. Used by more than 40% of coaching psychologists, the GROW model breaks coaching sessions down into four phases: Goals, Reality, Options, and Wrap-up (Grant, 2011, Appendix I). As a tool for goal-setting and problem-solving, this model employs an inside-out teaching strategy that assumes learners have unlimited potential which can be unlocked with proper coaching. Despite its widespread use, however, it has been criticized for failing to consider where people are coming from prior to helping them get where they want to go (Bishop, 2015). For this reason, Grant suggests adding "RE" to the acronym, which allows time to review and evaluate, on the premise that "each coaching session should start with a process of reviewing and evaluating the learnings and actions completed since the last session" (p. 124, Appendix J). This critical aspect of the coaching model allowed the surgical coach to help guide resident reflection through the experiential learning process as described by philosopher Donald Schön and conceptualized in the POMM.
Video recordings were all deidentified and maintained on a private, online, video streaming platform. Within 48 hours of their recording session, residents in both the control and experimental group were sent a unique email link to their video. This link allowed them unlimited viewing access to assess their technique and perform comparison analysis with the expert video they were provided at the beginning of the anastomosis lab. Upon completion of each recorded anastomosis session (week one, three, and five), residents in both groups were asked to evaluate themselves using the MOSAT and GRS electronic evaluation form.
In addition to the residents performing a self-evaluation of their videos, each video was also sent to the two blinded vascular faculty members for analysis and scoring using the same electronic evaluation form incorporating the MOSAT and GRS. Group randomization in this research design facilitated the faculty blinding technique to better assess whether the intervention actually affected the outcome. More specifically, the Cochrane Handbook defines blinding as "the process by which study participants, health providers, and investigators, including people assessing outcomes, are kept unaware of intervention allocations after inclusion of participants into the study. Blinding may reduce the risk that knowledge of which intervention was received, rather than the intervention itself, affects outcomes and assessments of outcomes" (Higgins & Green, 2011, Box 8.11.a). The deidentified videos from each resident trial in weeks one, three, five, and seven were sent to the blinded faculty evaluators through a unique email link the same week they were recorded. There was no specific order in which the videos were sent to the blinded faculty, and the order in which they received each video varied across the sessions.

Validity
Validity generally refers to the accountability and legitimacy strived for through data collection, analysis, and interpretation (Onwuegbuzie & Teddlie, 2003). As such, it is an important concept in quantitative research because it speaks to the strength of the conclusions drawn from the results. More specifically, are the reported results of the research accurate? There are three types of validity that researchers need to account for when performing research: content validity, construct validity, and criterion validity. Content validity is concerned with the extent to which a research instrument accurately measures all aspects of the content.
Construct validity refers to the extent to which an instrument measures the intended construct, and criterion validity concerns the relationship between scores obtained using the instrument and an external criterion (Fraenkel, Wallen, & Hyun, 2011).
To help establish and maintain content and construct validity specifically, the three faculty members from the division of vascular surgery met with the PI of this research in person, one month prior to the start of the anastomosis lab. During this meeting, the concepts and core objectives of the anastomosis lab were discussed, and the scoring rubric for the MOSAT and GRS, as a measure to quantify technical skill acquisition, was agreed upon. All sections from the MOSAT and GRS were discussed at this meeting, and all agreed that the components were necessary actions for correctly performing a vascular anastomosis. After this meeting, the vascular surgeon who was to serve as the coach and facilitator throughout the simulation lab met separately with the PI for a training session on how to implement the coaching model with the experimental group attending the simulation lab. This faculty member was guided on the specific structure and components of the Re-GROW coaching framework and demonstrated understanding of how best to implement this model into practice (Appendix K) and how best to incorporate specific feedback into each resident coaching session.

Data Collection
The data collection for this research came from several sources. The primary source of data was video capture. A mini action camera was mounted to an articulating arm, placing the camera's field of view approximately 12 inches above the Ethicon suturing jig on which the residents performed the anastomosis.
The recorded video segments from each anastomosis session were all edited and deidentified by the PI, using video editing software, and then uploaded to a private, online, video streaming platform. From this platform, a unique link was created for each video, which was then securely sent through email to each respective resident and the two blinded faculty for evaluation purposes. Only those with the unique link were able to view the video, and no links to the videos were made public. An example anastomosis video can be viewed here: Example Video.
For standardization, security, and ease of data collection, the MOSAT and GRS were combined into one single evaluation form (Appendix L) using our local Research Electronic Data Capture (REDCap). The REDCap evaluation allowed both residents and blinded faculty to quickly evaluate the anastomosis video using either a smartphone, tablet, or personal computer. The evaluations were all maintained on the local REDCap server, which only the PI had access to. This format allowed the PI to maintain the data on a local, secure, web-based server throughout the study, and easily export the deidentified evaluation data into the Statistical Package for the Social Sciences (SPSS) for data analysis. The MOSAT and GRS evaluation form used for this project's data collection can be found here: MOSAT and GRS Eval.

Data Analysis
Appropriate descriptive and inferential statistical methods were applied based on the specific probability distributions for each variable. Results are reported as frequencies, distributions, means, medians, modes, and standard deviations. For continuous data (RQ1A), the two groups (self-assessment video analysis only versus structured, individualized, video-based coaching) were compared using an independent t-test and the multivariate equivalent of repeated measures, or mixed ANOVA, to assess variability within and between groups. By using this mixed ANOVA, the separate anastomosis trials (RQ2) allowed us to examine the correlation between separate data points and the rate of overall performance improvement experienced by the participants. Comparison of time and leak rates with previous anastomosis labs (RQ4) was evaluated using a Wilcoxon signed-rank test, as the data did not conform to a normal distribution. To assess internal consistency between the blinded examiners' MOSAT checklist and GRS scores, inter-rater agreement and reliability were examined by obtaining the Kappa coefficient along with the Pearson correlation, Kendall's tau, and Spearman's rho for comparison. The relationship between variables in RQ1B and RQ3 was nonlinear; therefore, comparison of residents' self-evaluation of technical skill scores with the scores provided by the expert evaluators was analyzed using a Spearman correlation coefficient. All statistical analyses for this study were conducted using IBM SPSS version 25 (released 2017, IBM Corp., Armonk, NY). Statistical significance was set at a p-value of 0.05 or less.
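Spearman's rank-order correlation, used here because the RQ1B and RQ3 relationships were nonlinear but monotonic, amounts to computing the Pearson correlation on rank-transformed data. The analysis itself was conducted in SPSS; the following pure-Python sketch, with made-up, hypothetical scores, is included only to illustrate the computation, including averaged ranks for ties.

```python
def average_ranks(values):
    """Assign 1-based ranks to values, averaging the ranks of ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over any run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical self-assessment vs. attending scores for five residents
print(spearman_rho([12, 15, 14, 18, 20], [22, 25, 24, 28, 30]))  # 1.0
```

Because only the ordering of scores matters, the statistic is insensitive to the different scales of the resident and attending instruments, which is why it suits a monotonic but nonlinear relationship.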

Chapter Three Summary
This chapter described the methodology and procedures employed in an effort to compare self-assessment video analysis against structured, individualized, video-based coaching as a way to improve technical skill acquisition in surgical residents participating in a vascular anastomosis simulation workshop. The research questions, research design, sample population, instrumentation and procedures employed in this research were all presented and described accordingly. Additionally, this chapter discussed the data collection process this research utilized, as well as the approach selected for data analysis based on the information attained. The chapter concluded with some of the limitations associated with the project as a whole. The presentation of this data in chapter four will specifically address the research questions proposed, and the general demographic information of the research participants.

Chapter Four: Presentation and Analysis of the Data
Surgical training has changed dramatically over the past two decades in light of concerns over patient safety, resident oversight, and resident well-being. The cumulative effects of these changes on the competency of general surgeons completing five years of residency under this new training paradigm, however, have only recently come to light.
Because of this, there is a greater emphasis on the role of simulation lab training during surgical residency to improve technical skill acquisition, in an effort to efficiently achieve competency in designated core procedures. The purpose of this study was to compare self-assessment video analysis against structured, individualized, video-based coaching as a way to improve technical skill acquisition in novice, junior surgical residents learning how to perform a vascular anastomosis. The results from this research will now be presented and discussed in relation to each of the research questions.

Resident Participation
A total of thirteen (n=13) surgical residents participated in this vascular anastomosis lab, which ran for a total of seven weeks during the summer of 2019, from July 8th through August 23rd. As the start of a new academic year always commences in the final week of June, each of the residents participating in this research had just begun their second year of surgical residency. This is important to note, as none of the residents had any prior practical experience performing an end-to-side vascular anastomosis, the focus of this simulation lab. The majority of the participants (n=9) were residents from the department of general surgery. An additional four residents from two other surgical subspecialties participated in the lab, as this anastomosis skill is also pertinent to their professional development: two from the department of urologic surgery (n=2) and two from the department of plastic surgery (n=2). According to the 2010 census, women currently make up approximately 50% of graduating medical school students but only 36% of surgical residents and 15% of active general surgeons (Bruce, Battista, Plankey, Johnson, & Marshall, 2015). The distribution of female resident participants in this research was slightly better than this national statistic, with 46% representation. In 2015, Black Americans represented only 5.7% of graduating medical students and 6.2% of general surgery residents (Abelson et al., 2018). Black resident participants in this research mirrored this statistic, accounting for only 7% of the total group. Table 1 provides an overview of the thirteen residents who participated.

Inter-Rater Agreement and Reliability
In order to obtain a benchmark score for technical skill acquisition, two blinded faculty members scored each resident independently using the MOSAT and GRS evaluation tool. To derive a single benchmark score for each resident trial from these two data sets, the inter-rater agreement and inter-rater reliability of the scores obtained from the two blinded faculty members were calculated using Cohen's kappa. The same correlation was found for the faculty GRS scoring, with Pearson's r (0.80), Kendall's tau (0.66), and Spearman's rho (0.83) all signifying a strong level of concordance between the scores administered by the two faculty evaluators. Given these findings, the two sets of blinded faculty scores (attending 1 and attending 2) from the MOSAT and GRS evaluations for each resident were averaged to provide one single score. This single MOSAT and GRS score was then used as the benchmark against which each resident would be compared, and against which residents would also compare themselves, as part of their self-assessment and final analysis. See Table 2 for inter-rater agreement and reliability.
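Cohen's kappa corrects two raters' observed agreement for the agreement expected by chance alone. The study's values were computed in SPSS; the sketch below is a minimal, hypothetical illustration of the calculation for categorical ratings (for example, checklist items marked pass or fail by each blinded attending).

```python
def cohens_kappa(rater_a, rater_b):
    """kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed
    agreement and p_e is the chance agreement implied by each
    rater's marginal category frequencies."""
    n = len(rater_a)
    # observed proportion of items on which the raters agree
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # chance agreement from the two raters' marginal distributions
    categories = set(rater_a) | set(rater_b)
    p_e = sum((rater_a.count(c) / n) * (rater_b.count(c) / n)
              for c in categories)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical pass(1)/fail(0) checklist items from two blinded raters
print(cohens_kappa([1, 1, 0, 1], [1, 0, 0, 1]))  # (0.75 - 0.5) / 0.5 = 0.5
```

A kappa of 0 therefore means the raters agree no more often than chance would predict, and 1 means perfect agreement, which is why kappa is preferred over raw percent agreement for this purpose.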

Baseline Performance (Pre-Test)
Prior to answering the research questions that guided this study, it is important to consider the baseline performance and ability of residents who participated in this anastomosis lab. Prior to the Pre-Test (week one), the residents were randomly assigned to either the self-assessment video analysis only group (Non-Coaching) or the structured, individualized, video-based coaching group (Coaching). Participants arrived at the anastomosis lab in week one with no formal instruction and were asked to perform an end-to-side anastomosis per the protocol. Table 3 shows the means and standard deviations for each measured variable after the Pre-Test performance. See Table 4 and Graph 1 for a detailed overview of the categorical effects between groups and a representation of this data. Normality was assessed using Shapiro-Wilk's test on the studentized residuals. As the ANOVA is considered fairly robust with respect to deviations from normality, the PI chose not to transform the data from week five, as this would likely not affect the overall results, which are presented next.

MOSAT
With regards to the attending MOSAT scores, there were two outliers in the data, as assessed by inspection of a boxplot for values greater than 1.5 box-lengths from the edge of the box. These outliers occurred only in the experimental group (Coaching), once in week five and once in week seven (Post-Test). Removing these outliers and reanalyzing the data did not change the overall results, so they were both maintained for final analysis. With the exception of the week five data, the data from week one (Pre-Test), week three, and week seven (Post-Test) were normally distributed, as assessed by Shapiro-Wilk's test of normality. See Table 5 and Graph 2 for descriptive and inferential statistical representation of this data. Refer to Table 6 for the mean differences in MOSAT scores compared by week.

Mauchly's test of sphericity indicated that the assumption of sphericity was met for the two-way interaction, χ2(5) = 7.54, p = .185.
The main effect of time to complete the anastomosis showed that there was a statistically significant difference with an increase in time to complete the anastomosis within subjects, F (3, 33) = 3.75, p < .001, partial η 2 = .25. Pairwise analysis (Table 10) shows this difference was between week one (Pre-Test) and Week 3 (trial 2) only.
Overall, there was an increase in time to complete the anastomosis from week one (Pre-Test) to week seven (Post-Test). See Table 9 and Graph 4 for descriptive and inferential statistical representation of this data. For leak rate, Mauchly's test of sphericity indicated that the assumption of sphericity was met for the two-way interaction, χ2(5) = 5.45, p = .365.

Graph 4: Average Time to Complete Anastomosis (Mean)
The main effect of leak rate from the completed anastomosis showed that there was a statistically significant difference in leak rates within subjects, F (3, 33) = 6.23, p = .002, partial η2 = .362. There was an overall decrease in leak rate from week one (Pre-Test) to week seven (Post-Test).

Combined Technical Score
In order to quantify a resident's technical skill acquisition, a combined technical score (CTS) was developed for this project using the objective measurements of MOSAT and GRS scores, time to complete the anastomosis, and leak rate. The CTS was calculated using the following formula:

Combined Technical Score = 2 x (MOSAT x 2) + (GRS x 2) - Time - Leak Rate
With regards to the resident's calculated CTS, there was one outlier in the data, as assessed by inspection of a boxplot for values greater than 1.5 box-lengths from the edge of the box. This outlier occurred in the Non-Coaching group in week seven (Post-Test).
Because of the significance of this score, as a marker of technical skill, this outlier was removed from the data set and the data was re-run to assess for any change in significance. This, however, did not change the overall results, so this outlier was maintained in the final analysis.
The data from all four trial weeks were normally distributed, as assessed by Shapiro-Wilk's test of normality. Pairwise analysis (Table 14) shows that significant differences in CTS occurred between weeks one, three, and five, but not between weeks five and seven. The between-subjects effect of Coaching versus Non-Coaching did not show a significant difference in CTS, F (1, 11) = 3.50, p = .088, partial η2 = .241. See Table 13 and Graph 6 for descriptive and inferential statistical representation of this data. In order to evaluate the effect coaching may have had on self-assessment scoring for residents performing this end-to-side vascular anastomosis, inferential statistics using a two-way repeated measures ANOVA, with a between-groups design, were performed.

Resident MOSAT
With regards to resident self-assessment MOSAT scores, there were no outliers in this data set, as assessed by inspection of a boxplot.


Resident GRS
With regards to resident self-assessment GRS scores, there were two outliers in the Non-Coaching group: one in week one (Pre-Test) and one in week five. Removing these outliers did not alter the significance of the variable; therefore, they were retained for final analysis for consistency. The data were normally distributed, as assessed by Shapiro-Wilk's test of normality (p > .05). There was homogeneity of variances (p > .05) and covariances (p > .001), as assessed by Levene's test of homogeneity of variances and Box's M test, respectively. Mauchly's test of sphericity indicated that the assumption of sphericity was met for the two-way interaction, χ2(5) = 3.99, p = .553. The main effect of resident self-assessment scores showed that there was a statistically significant difference in GRS scores within subjects, F (3, 33) = 36.48, p < .001, partial η2 = .768. There was an overall increase in resident self-assessment GRS scores from week one (Pre-Test) (M = 15.85, SD = 3.98) to week seven (Post-Test) (M = 28.23, SD = 6.87), with a statistically significant mean difference of 12.43, 95% CI [7.08, 17.78], p < .001. Pairwise analysis (Table 18) shows that significant differences in resident GRS scores occurred between weeks one, three, and five, but not between weeks five and seven. The between-subjects effect of Coaching versus Non-Coaching did not show a significant difference in resident self-assessment GRS scores, F (1, 11) = .273, p = .612, partial η2 = .024. See Table 17 and Graph 8 for descriptive and inferential statistical representation of this data, and Table 19 for between-group comparisons. A Spearman's rank-order correlation was also run to assess the relationship between resident self-assessment GRS scores and Attending GRS scores across the four separate trials. Preliminary analysis showed the relationship to be monotonic, as assessed by visual inspection of a scatterplot.
There was no statistically significant correlation between resident self-assessment GRS scores and Attending GRS scores across the four separate trials: rs(11) = .402, p = .173; rs(11) = -.235, p = .440; rs(11) = .448, p = .125; and rs(11) = -.389, p = .188, in that order. This correlation did not change significance when participants were separated into Coaching and Non-Coaching groups and compared to attending scores. See Table 22 for resident and attending scoring for all four trials.

RQ4
To what extent do the current posttest scores on time to fashion the anastomosis and leak rate compare with the prior three years of resident vascular anastomosis scores?
Prior vascular anastomosis labs at our institution have followed a similar, albeit simplified, format in an effort to assess residents' technical skill. Using time to complete the anastomosis and the amount of saline leak from the completed anastomosis provided one measure to assess residents' baseline technical skill (Pre-Test) and compare it with any potential improvement in performance during the final week of the vascular anastomosis lab (Post-Test). The score that had been used to assess differences between Pre-Test and Post-Test was based upon the following formula:

Score = 300 - (Time x 10) - Leak Rate
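Assuming time and leak are recorded in the same units as the study's other measures (the units here are an assumption), this legacy score reduces to a one-line calculation; the sample values below are hypothetical.

```python
def legacy_score(time_taken, leak):
    """Simplified prior-lab score: 300 - (Time x 10) - Leak Rate.
    Time is weighted ten-fold relative to leak volume, and both
    are subtracted from a fixed ceiling of 300 points."""
    return 300 - (time_taken * 10) - leak

# Hypothetical example: time 12, leak 30
print(legacy_score(12, 30))  # 300 - 120 - 30 = 150
```

Unlike the CTS used in the present study, this legacy formula contains no expert-rating terms, so it rewards only speed and leak control.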
In an effort to assess how current results from the 2019 vascular anastomosis lab held up, we applied this simplified formula to the current residents' performances and compared their scores with those of the prior three years of residents. As some of the data were not normally distributed, a Wilcoxon signed-rank test was used for these comparisons. Refer to Table 24, Table 25, Table 26, Table 27, and Graph 9 for Z scores and medians for each cohort. These results suggest there was no difference between structured, individualized, video-based coaching and self-assessment video analysis only (Non-Coaching) as a modality to improve technical skills in a vascular anastomosis simulation lab. The results from this quantitative research will add to the literature on the use of coaching methods, video analysis, and self-assessment of technical skill acquisition in surgical residency programs. Chapter five will provide an interpretation of the data along with a detailed discussion based on the results. Findings from this research will be presented in a manner that extends the knowledge base and theoretical foundations contained in the accompanying literature review. In addition, limitations and suggestions for future research will be presented.

Chapter Five: Discussion, Conclusions, and Implications
The purpose of this research was to employ a validated technical skills checklist to compare structured, individualized, video-based coaching against self-assessment video analysis as a way to improve technical skill acquisition in surgical residents participating in a vascular anastomosis simulation workshop. This IRB approved research was conducted in a large academic, level one trauma center in the Northeastern United States with thirteen second-year junior surgical residents. While all participating residents significantly improved their technical skills related to fashioning an end-to-side vascular anastomosis, this research was unable to show a difference between the structured, individualized, video-based coaching group and the self-assessment only (Non-Coaching) group. A detailed discussion of these results, as they relate to each research question, will be presented next.

Discussion
The policy changes implemented in graduate medical education over the past two decades have forced surgical residency programs to become more efficient in their approach to educating residents. The shift away from a sole reliance on the apprenticeship model of training, and the movement towards the adoption and application of adult learning theory, has compelled surgical educators to rethink the way we teach basic technical skills to residents. In an effort to help residents attain technical competency, which will allow them to independently perform core surgical procedures by their fifth and final chief year of training, strategies such as video analysis and coaching models are continuously being evaluated. The research highlighted here was undertaken as a way to incorporate a variety of theoretical frameworks and core concepts that support human psychomotor development in an effort to improve technical skill acquisition.
This study was designed to evaluate and compare the effect of structured, individualized, video-based coaching against self-assessment video analysis as a way to improve technical skill acquisition. Multiple studies have compared coaching models in surgery against a standard curriculum, or employed video-based analysis with and without feedback, as a methodology for teaching technical skills (Backstein et al., 2004, 2005; Farquharson, Cresswell, Beard, & Chan, 2013; Singh et al., 2015; Soucisse et al., 2017). To this author's knowledge, however, this is the first study to have investigated whether video-based coaching is superior to self-assessment video analysis. While the data, as presented here, were unable to show a significant difference between the two modalities, there are some fundamental concepts that need to be acknowledged and discussed.
Careful attention was paid to the structure and setup of this vascular anastomosis simulation lab in a manner consistent with Kolb's experiential learning theory (1984), which served as the framework around which this research was designed. In order to help our trainees construct their own knowledge and improve their technical skills through a transformative experience, this research employed an evidence-based approach to guide resident participants through the four different abilities required to learn. Providing our participants with a meaningful concrete experience, guiding them through the process of reflective observation, helping them better understand the process through abstract conceptualization, and allowing them to engage in meaningful opportunities for active experimentation was the ultimate goal of this lab.
In order to meet this goal, however, attention to the following concepts was required.
Distributive practice and the deliberate practice effect played a significant role throughout this research project with both the setup and structure of this anastomosis lab.
Distributive practice, as has been previously discussed, has been shown to result in improved acquisition and transfer of technical skills learned in simulation settings (Dawe et al., 2014; Moulton et al., 2006), and allows learned skills to consolidate through sleep between practice sessions (Louie & Wilson, 2001). Deliberate practice, as advocated by Ericsson (2004), stipulates that practice sessions should be structured around well-defined learning objectives, include detailed feedback on performance, and be guided by error correction and opportunities to improve performance through repeated practice. These concepts were supported throughout this research by holding simulation lab sessions each week for the entirety of this simulation lab, with required attendance on the part of the junior surgical resident participants.
These weekly sessions were all held in three-hour blocks to accommodate participant schedules, with residents typically spending at least one hour in the lab during these sessions. This time allowed each participant repeated practice performing their own end-to-side anastomosis and also provided them with time to serve as an assistant.
While optimum practice schedules to improve performance vary from one domain to the next, research supports practice sessions lasting an hour or less to be optimal for learning (Stefanidis & Heniford, 2009;Van Dongen et al., 2011). Consistent with this, Ericsson found that expert performers typically engage in practice without rest for an hour a day, especially in the early morning, and that concentration is the main factor in this time constraint (Ericsson, 2006). While we did our best to hold morning practice sessions, we did have to schedule practice sessions in the afternoon in order to accommodate resident schedules.
Benchmark demonstration videos were provided to all participants after the Pre-Test, giving them on-demand access to compare their performance to that of the expert. This video was provided to serve as a motivating factor and should have also contributed to deliberate practice throughout the six weeks, helping participants set realistic performance goals for their next session. Using the analytics provided by the private, online video streaming platform used for video distribution to participants, the benchmark video released at the end of week one was viewed a total of seventy-seven times over the course of the next six practice weeks, averaging out to approximately six views per participant. A second benchmark video was released at the end of week two to help clarify questions that most of the resident participants had around completing the transition stitch. This video accumulated a total of forty views over the following five weeks, averaging out to an additional three views per participant. While the number of times participants viewed their own videos for playback analysis across the trial weeks ranged from five to twenty-eight views in total, for an average of twelve views per participant, this variation ultimately showed no effect in post hoc analysis. When participants who viewed their videos ten times or less over the course of the seven weeks (n = 6) were compared to participants who viewed their videos more than ten times (n = 7), there was no statistically significant difference between these two groups in any of the seven variables recorded throughout this research.
The structure of this lab was also dictated by the tenets of motor learning theory, particularly with regards to providing feedback to participants throughout their scheduled sessions. While Adams' Closed-Loop Theory of motor learning (1971) has been supplanted by Schmidt's Schema Theory (1975), his emphasis on the role of feedback, error detection, and error correction in psychomotor development remains pertinent as it relates to this research. Adams stipulated that in order to learn a correct movement, "the subject needs knowledge of results to inform him about the correctness of the last movement, and response-produced feedback stimuli to inform about the progress of the current movement" (Adams, 1976, p. 90). Building off Adams' theory, Schmidt later focused his efforts on determining the factors that most significantly influenced motor learning. In addition to concentration and motivation, Schmidt (2004) noted that an instructor's extrinsic feedback regarding errors "is one of the more important sources of information" (p. 305), especially as it pertains to knowledge of results. This finding has also been echoed in other research, where immediate feedback has been found to be especially useful in correcting inappropriate actions related to procedural and motor skills (D. I. Anderson, Magill, & Sekiya, 2001; J. R. Anderson, Conrad, & Corbett, 1989; Grillo, 1999; Mory, 2004).
In order to implement this type of feedback in this vascular anastomosis simulation lab, the attending surgeon facilitating the lab served a dual role, providing extrinsic, immediate feedback to resident participants in both the Coaching and Non-Coaching groups in an attempt to improve their technical skill.
During the open practice sessions in weeks two through six, guided instruction and feedback were provided in general terms to all participants, in line with the specific learning objectives of the anastomosis lab. Most of this directed feedback was provided when specific questions arose on the part of the participants. For example, one resident, still trying to master the transition stitch during week three, asked the facilitator to watch them perform this part of the anastomosis and help walk them through it. While this feedback was directed at one resident, the other resident participants in the lab also benefitted from the discussion that followed.
In addition to this generalized instruction, the Coaching group had the additional benefit of receiving specific, individualized feedback from the facilitator based on a video review of their previously recorded performance. During these coaching sessions, the comments and discussion were structured around the Re-GROW coaching framework and guided by the following questions:
• Review/Evaluate: How do you think you did with the anastomosis during this past week? Are you satisfied with your performance?
• Goal: What do you want to work on this week with regards to your performance?
• Reality: What do you think you are doing well? How did you do on the scoring evaluation? Where were your deficiencies?
• Options: How would you like to improve on these deficiencies? If you could only work on one aspect of your performance during the next practice session, what would it be?
• Wrap up/Way forward: Are there any challenges to this plan moving forward? How can we best work through them?
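The Re-GROW prompts above can be captured as a simple, reusable session template. The sketch below is illustrative only: the stage names and questions come from this lab's framework, while the data structure and function names are assumptions introduced here.

```python
# A minimal sketch of the Re-GROW coaching prompts used in this lab.
# Stage names and questions are taken from the framework above; the
# dictionary layout and function name are illustrative, not prescriptive.
RE_GROW_PROMPTS = {
    "Review/Evaluate": [
        "How do you think you did with the anastomosis during this past week?",
        "Are you satisfied with your performance?",
    ],
    "Goal": [
        "What do you want to work on this week with regards to your performance?",
    ],
    "Reality": [
        "What do you think you are doing well?",
        "How did you do on the scoring evaluation?",
        "Where were your deficiencies?",
    ],
    "Options": [
        "How would you like to improve on these deficiencies?",
        "If you could only work on one aspect of your performance during the "
        "next practice session, what would it be?",
    ],
    "Wrap up/Way forward": [
        "Are there any challenges to this plan moving forward?",
        "How can we best work through them?",
    ],
}

def session_outline(prompts=RE_GROW_PROMPTS):
    """Render the coaching prompts as a plain-text session outline."""
    lines = []
    for stage, questions in prompts.items():
        lines.append(stage)
        lines.extend(f"  - {q}" for q in questions)
    return "\n".join(lines)
```

A facilitator could print `session_outline()` before each coaching session so that every video review walks through the same five stages in the same order.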
Residents in this coaching group also had time during these structured sessions to address any other questions they may have had related to performing this anastomosis. To help reinforce this feedback, common themes that arose in these coaching sessions were combined in a summative email after each coaching session and sent to the participants (n = 6). The email sent after the first coaching session included the following reinforcing feedback:
1. Make intentional throws. Try not to drive the needle through the graft, only to realize it wasn't exactly where you wanted it, and then remove the needle to make another stick. This will increase your leak rate and cause more bleeding in the operating room.
2. Economy of motion. It's impossible to rotate the patient in the OR to make your suturing easier. Try to get comfortable rotating your body to make your throws easier. You are not confined to one position.
3. Make sure your assistant is providing enough tension on the graft. This allows you to better see what you are doing/improve placement of your bites. Also remember to ensure your assistant is following with the suture to prevent laxity in your closure.
For comparison, and to demonstrate the progression in the type of feedback given to the coaching participants, the summative email sent after the last coaching session (prior to the Post-Test) included the following reinforcing feedback:
1. Most of your times have improved substantially as your economy of motion has improved dramatically. Remember to fight the urge to grasp the needle with your Gerald forceps; use your Castro for this. As Dr. Carruthers pointed out, use your Gerald as a platform to steady your Castro as you grab the needle.
2. You have all done a better job grasping the transition stitch around the heel of the anastomosis. Try not to travel too far as you complete this horizontal mattress suture, as this seems to be the area where most of your significant "leaks" are occurring.
3. You are all doing a much better job with intentionally placing your bites, compared to the first couple of weeks, where there were a lot of sticks that were ultimately redone for better suture placement. Remember, your bites should be 1mm, no more than 2mm, apart with each throw. Making smaller, more frequent bites not only increases your time, but also produces more needle holes for leaks.
4. Remember to inspect your anastomosis regularly to make sure your sutures remain under tension and the graft lines up appropriately. It is tough once you get into a rhythm to stop and check things over but try and do this at least a few times throughout the anastomosis, you'll be pleased with your results.
5. Continue to use your assistant to your advantage. Don't let them be a passive "grasper." Tell them how you want them to hold/orient the graft.
Force them to put appropriate tension on the graft and open up your graftotomy to allow for more precise suture placement.
This was the specific intervention strategy developed for this anastomosis lab using the best available evidence surrounding distributive practice schedules, the deliberate practice effect, reflective practice, and motor learning theory with respect to psychomotor development and technical skill acquisition. Further discussion of the results will now be guided by the specific research questions that formed the basis of this study.

This research was unable to show a significant difference in technical skill acquisition between the structured, individualized, video-based coaching group and residents in the self-assessment, video analysis only (Non-Coaching) group.
Before this discussion continues, however, it is important to address the week one Pre-Test data analysis. Upon inspection, and despite participant randomization, Table 4 identified three variables that, when analyzed by an independent samples T test, differed significantly between the Coaching and Non-Coaching groups prior to any intervention being carried out. The three variables in question were: Attending GRS (14.25 vs 17.29), Time to Complete the Anastomosis (24.17 vs 27.14), and Combined Technical Score (-34.67 vs 5.29). Given this information, the question becomes: how should we account for these differences between groups? Should we consider them significant at all? And do these results alter the way we interpret the final results from week seven (Post-Test)?
It is common practice, particularly in clinical trials, to present baseline characteristics of participants and then run statistical analysis on these groups to show the groups are comparable. Despite this practice, Altman (1985) notes that statistical testing of baseline comparability should be avoided when proper randomization has occurred, as any baseline difference between the groups has a twenty percent chance of occurring simply due to chance. He further states that "performing a significance test to compare baseline variables is to assess the probability of something having occurred by chance when we know that it did occur by chance" (p. 126). Taking this into consideration, we should nevertheless question why these three variables may have produced a significant difference.
The first question that needs to be addressed, now that Table 4 has shown evidence of baseline differences, is whether randomization occurred correctly for this research. Elkins (2015) points out that when true randomization is not undertaken, practices such as quasi-random allocation, where participants are assigned based on age, birthdate, odd versus even number, etc., can contribute to systematic differences between groups at baseline. As a random number generator was used in this research to allocate participants to either the Coaching or Non-Coaching group, we should have avoided this potential confounder. The next, and perhaps most important, question is whether the sample size (n = 13) contributed to the significant baseline differences noted. To this question, the answer is almost certainly yes. A small sample size, like this one, is much more prone to baseline imbalances due to random chance than larger trials with more participants. This point is best made by Roberts and Torgerson (1999), who note that "as the trial size increases, the absolute size of imbalance in baseline characteristics will reduce, owing to a reduction in sampling error. Hence the absolute magnitude of any chance bias in outcome will tend to decrease with sample size" (p. 185). Because this research only employed thirteen participants, I believe this is what ultimately contributed to the differences of significance that were observed in the Pre-Test.
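Roberts and Torgerson's sample-size point can be illustrated with a short simulation. This is a sketch, not an analysis of this study's data: it assumes standardized (mean 0, SD 1) baseline scores and simply measures how far apart two randomly allocated groups drift at baseline, on average, for small versus large trials.

```python
import random

def mean_baseline_imbalance(n_participants, n_trials=2000, seed=1):
    """Randomly allocate simulated participants into two groups and
    return the average absolute difference in group means at baseline."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        # Simulated baseline scores: standard normal, purely by assumption.
        scores = [rng.gauss(0.0, 1.0) for _ in range(n_participants)]
        half = n_participants // 2
        group_a, group_b = scores[:half], scores[half:]
        total += abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))
    return total / n_trials

# A 13-participant trial (as in this research) versus a ten-times-larger one.
small = mean_baseline_imbalance(13)
large = mean_baseline_imbalance(130)
```

Under these assumptions, the 13-participant allocation shows several times the average baseline imbalance of the 130-participant one, purely from sampling error, even though randomization is performed correctly in both cases.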
Finally, with regards to the baseline differences, it is hard to quantify exactly how much previous exposure each resident had to suturing small anastomoses like this one, which could have ultimately contributed to differences in time to perform the anastomosis and, more specifically, individual technique. While all participants were junior surgical residents in their first month of their second year of residency, previous suturing exposure and practice up to that point is extremely variable and could have also contributed to baseline differences. The four resident participants from outside general surgery (plastic and urologic surgery), although equally distributed across the Coaching and Non-Coaching groups, will also have had some differences in practical exposure to suturing. For this reason, I think these baseline differences are worth mentioning, and I am reporting statistical significance in Table 4 for completeness and transparency of the data. It should be noted, however, that the Consolidated Standards of Reporting Trials (CONSORT) statement (Moher et al., 2010) does not recommend baseline significance testing as common practice, regarding it as "superfluous and can mislead investigators and their readers" (p. 17). Instead, the recommended practice is for tables to be provided on baseline participant characteristics, allowing readers to use their own judgement to decide if any differences among participants are substantial enough to have potentially influenced the outcome of the research.
Despite these baseline differences between groups, the most important findings of this research relate to the improvement in technical skill experienced by both the Coaching and Non-Coaching groups over the course of the seven-week vascular simulation lab. As measured by two expert evaluators using a validated scoring system, significant improvements were appreciated in both groups, as evidenced by the Attending MOSAT and Attending GRS scores. With respect to research question two (RQ2), there was also no significant difference in the rate of technical skill acquisition between the two groups at weeks three, five, or seven (Post-Test). It appears that both the Coaching and Non-Coaching groups made steady improvement in performance each week. This can best be appreciated in Graph 10 and Graph 11, where the rate of technical skill acquisition appears to run parallel between the two groups. This also held in the within-subjects analysis, as neither the Attending MOSAT nor the Attending GRS interaction with group (Coaching/Non-Coaching) reached statistical significance.
While there were no significant differences appreciated when comparing technical skill acquisition between the Coaching and Non-Coaching groups, it was encouraging, despite our previous discussion of the baseline differences observed in the Pre-Test, to see that no significant differences were observed between groups in the week seven Post-Test (Table 28). This suggests that any differences actually present at baseline favoring the Non-Coaching group were eliminated through our intervention, allowing the Coaching group to "catch up" with the Non-Coaching group by week three and beyond.
(Graphs: Attending MOSAT Trend and Attending GRS Trend, Coaching vs Non-Coaching)

While the mean difference between the two groups was 3.01 for the MOSAT and 3.04 for the GRS in week one, the mean difference in week seven was 0.13 for the MOSAT and 1.04 for the GRS. It is also interesting to note that the MOSAT scores appeared to level out by week five for both groups, with significant pairwise comparisons evident between weeks one, three, and five for both groups, but not between weeks five and seven. This differed from the GRS scores, which showed significant pairwise comparisons between all four test weeks. This effect may be explained by the objectivity and binary scoring structure inherent in the MOSAT, compared to the five-point Likert scale scoring of the GRS, which can lend itself to more subjectivity on the part of the evaluator. The MOSAT also had a top possible score of twenty-two points, while the GRS had a top possible score of forty points. This larger variation in potential score for the GRS may also have accounted for the significance appreciated across all weeks.
As the variables of time and leak rate will be discussed in greater detail when we address research question four (RQ4), here they will be discussed with regard to their role in the calculation of the Combined Technical Score (CTS). The CTS was developed for this research project as a way to objectively quantify resident participants' technical skill acquisition. While the CTS will need further research to validate it as a reliable measure of technical skill acquisition for trainees in a vascular anastomosis lab, for the purposes of this research this score represented the best way to combine the measured variables into a single score that would correlate with the Dreyfus Model of Skill Acquisition for each resident. In essence, establishing the CTS served the dual role of quantifying technical skill acquisition while providing a criterion to help identify each resident's level of acquired skill. The CTS can be represented by the following formula:

Combined Technical Score = (MOSAT × 2) + (GRS × 2) - Time - Leak
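The CTS can be sketched as a small function. The sketch below doubles each of the MOSAT and GRS scores and subtracts the completion time and leak penalty, which is consistent with the expert ceiling of just over 100 reported in this chapter; the parameter names and units are assumptions introduced for illustration. The threshold of 60 for competent practice also comes from this chapter's discussion.

```python
def combined_technical_score(mosat, grs, time_minutes, leak):
    """Combined Technical Score (CTS) for the anastomosis lab.

    A sketch of the scoring formula described in this chapter: the MOSAT
    and GRS scores are each doubled, then the completion time (in minutes)
    and the leak penalty are subtracted. Parameter names and units are
    assumptions for illustration, not part of the validated instrument.
    """
    return (mosat * 2) + (grs * 2) - time_minutes - leak

def meets_competent_threshold(cts):
    """The chapter proposes a CTS of 60 or better as the threshold for
    competent practice, with experts expected to score above 100."""
    return cts >= 60

# Illustrative (hypothetical) attending scores for a single performance.
cts = combined_technical_score(mosat=20, grs=33, time_minutes=32, leak=5)
```

With these hypothetical inputs the score lands in the 60s, comparable to the week seven mean CTS of 65.08 reported later in this chapter.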
The ultimate goal of a simulation lab, such as this one, is to provide junior surgical residents with the cognitive and psychomotor skills required to safely and effectively perform procedures in a real-world setting. The Objective Structured Assessment of Technical Skills (OSATS) was created in the late 1990s by surgical educators and researchers at the University of Toronto in an effort to formally assess technical skills in trainees (Hatala, Cook, Brydges, & Hawkins, 2015). By using direct assessment combined with a task-specific checklist (Martin et al., 1997), the goal was to move away from the underlying assumption that individuals who have performed a set number of procedures are technically "competent." Setting a fixed number of procedures to dictate what the literature refers to as "competency" is not effective, as it ignores the variability in individual psychomotor development. While proficiency-based training, as opposed to time-based training, is now well established in the literature (Ahlberg et al., 2007; Korndorffer Jr et al., 2005; Stefanidis et al., 2005) and has been adopted by most residency programs, the definition of what constitutes competent skill in a trainee has thus far remained elusive for surgical training programs. Up until now, research evaluating the implementation of standardized approaches to help differentiate between competent and non-competent performers has been lacking (Szasz et al., 2015).
This particular research employed the Dreyfus Model of Skill Acquisition in an effort to address this need and help identify the level of skill our resident participants were ultimately able to achieve by participating in this structured vascular anastomosis lab.
Based on the level of experience these junior resident participants had with performing an end-to-side vascular anastomosis prior to the start of this lab, all of them would correlate with the novice stage of skill acquisition. This stage is governed by learners who seek out strict rules to complete the task and have a limited ability to prioritize and synthesize information (Carraccio et al., 2008). Dreyfus notes that these learners do best when the task is deconstructed (S. E. Dreyfus, 2004). Resident performance during week three was also based on some basic knowledge of results, as described by Adams' theory of motor learning (Adams, 1971). Week five of the trial produced some of the best overall scores and was the week residents finally seemed to display the proper psychomotor development with regards to constructing this end-to-side anastomosis. Akin to the Dreyfus Advanced Beginner stage of skill acquisition, resident participants' practical experience with this anastomosis and pattern recognition at this stage were definite factors in their overall progress. This research would agree with Benner (2004), who noted that the "advanced beginner has a heightened awareness of any feedback on performance and pays close attention to the practice of colleagues" (p. 193). This was evident in the resident participants actively seeking out feedback on performance. In fact, the request in week two, on behalf of the residents, regarding proper placement of the transition stitch led to the production of our second benchmark video. The video performances for week five reinforce the fact that participants had reviewed this video multiple times, had practiced this transition stitch in the lab, and were now filtering information and feedback they had previously received in an effort to produce a quality end product.
Week seven (Post-Test) results are the closest we would be able to come in this lab setting to the level of skill Dreyfus labeled as competent performance. Efficiency was not preserved, as most resident participants increased their time in performing this anastomosis. What they traded in efficiency, however, they made up for in technical performance. Dreyfus and Dreyfus (1986) mention this as they describe the competent performer as becoming more vested in the resultant outcomes based on their actions.
This could be seen during the week seven Post-Test as multiple participants verbalized wanting their anastomosis to be perfect for this final week. For example, when a stitch was thrown that didn't fall as they initially had intended, they weighed the decision of re-doing the stitch against making an alternative plan for the next throw to compensate for how the graft would ultimately lie as a completed product. This kind of deliberate adjustment relates to what Schmidt termed the Generalized Motor Program (Schmidt, 1975). Schmidt outlined the four steps of storing information involved in the production of a goal-oriented movement: storing the initial conditions, such as proprioceptive information about the positions of one's limbs and body in space; storing response specifications for the motor program, which include alterations in movement, such as changes in speed and force; storing the sensory consequences of the response produced, such as actual feedback stimuli received from sensory organs during a task; and storing the outcome information, such as the success of the response in relation to the original intended outcome. The difficulty, however, for training junior surgical residents is determining how long it takes for this motor program to develop and contribute to automaticity in practice. Older literature offered ranges to determine "competency" for procedures, such as 50 to as many as 300 cases for flexible gastrointestinal endoscopy (Cass, 1999), and between 10 and 50 cases for a laparoscopic cholecystectomy (Moore & Bennett, 1995).
We know, however, that objective assessments, such as the one used in this research, are the best way to document and determine competent practice. As surgical training programs need proper documentation of resident participation in simulation activities and the ability to quantify performance to determine competent practice, this research provides some useful data. With regards to both the Coaching and Non-Coaching groups' participation in this vascular anastomosis lab, we can use the CTS to help identify resident performance according to the Dreyfus Model of Skill Acquisition. While this scoring system will need to be validated with future research and resident performance in next year's vascular anastomosis lab, a scoring model based on the CTS will be used to identify the level of technical skill acquisition for most performers, as we feel this best represents the stages that residents will move through during this simulation lab. We have also intentionally set up this scoring system so that an expert will score above 100 in the CTS. The automaticity participants developed over the course of this lab exists in a simulated environment, under constant, unchanging conditions. This level is different from that of a proficient or expert performer who, in a real-world, ever-changing environment, is able to maintain this automaticity at first by analytically thinking about their situation (proficient) and then, with more experience, is able to perform this anastomosis in the most critical of circumstances without thinking about their actions (expert). This year's participants reached a mean CTS of 65.08 by week seven, which is just within the competent stage of skill acquisition based on the scoring system we have proposed. Moving forward, our program can use this CTS for future vascular anastomosis labs in order to serve two purposes.
The first is to identify when a resident's performance is consistent with competent practice, which will allow them to participate in the operating room assisting an attending (expert) with this anastomosis. In the same respect, the CTS may also be used to help identify non-competent practice at the end of a scheduled seven-week lab.
This designation would mandate that any resident who scored under this threshold attend more practice sessions, until a CTS of 60 or better was attained, before being allowed to participate in the operative environment.

As previously discussed, Boud (2013) defined self-assessment as "the involvement of students in identifying standards and/or criteria to apply to their work and making judgements about the extent to which they have met these criteria and standards" (p. 12). This definition was instrumental in guiding the methodology of this research as it pertained to evaluating the ability of junior surgical residents to self-assess their own technical skill acquisition. While the majority of evidence suggests that physicians have a very limited ability to self-assess (Davis et al., 2006), multiple efforts have been made to improve this ability through the provision of benchmarks or explicit anchors for evaluation criteria (Ward et al., 2002). More recent research by Bonrath et al. (2015) demonstrated better self-assessment skills for residents who were assigned to a surgical coaching group. The research presented here, however, was unable to replicate this finding. As reported in chapter four, both the Coaching and Non-Coaching groups showed significant overall improvement in MOSAT and GRS self-assessment scoring across weeks one, three, five, and seven, but no between-groups difference in self-assessment scores reached significance. When broken down by group, the correlation coefficients with Attending MOSAT and GRS scores for the Coaching group ranged from -.580 to .429, while correlations for the Non-Coaching group ranged from -.604 to .687.
This variability is likely due to the inherent flaws in methodology that arise when trying to evaluate self-assessment, particularly regarding the validity and reliability of the gold standard score, which will be discussed in greater detail with respect to research question three (RQ3).
Despite this weak-to-moderate correlation with Attending MOSAT and GRS scores, resident self-assessment scores for both the Coaching and Non-Coaching groups significantly increased across weeks. Mean scores for the Resident MOSAT increased from 11.77 in week one to 19.77 in week seven, while mean scores for the Resident GRS increased from 15.85 in week one to 28.23 in week seven. These results appeared to level out by week five (mean MOSAT 19.38, mean GRS 26.62), with no significant difference noted between weeks five and seven. Comparative analysis using an independent samples T test also showed that self-assessment scores for the Coaching and Non-Coaching groups did not differ significantly in any week. One of the reasons Bonrath et al. (2015) may have been able to show a difference in self-assessment scores for residents who were surgically coached is that only the residents randomized to the coaching group had the ability to view video playback analysis of their performance. This is an important point, as this video benchmark of their performance likely contributed to reflective practice, as the authors point out, which may have allowed them to better assess their own performance. The research presented here, however, provided both the Coaching and Non-Coaching groups with an opportunity not only to evaluate their own performance using video analysis, but also to compare their performance to that of an expert. The steady improvement in Resident MOSAT and GRS self-assessment scores observed in this research may be best explained through what Donald Schön refers to as reflection-in-action and reflection-on-action (Schön, 2017).
Reflection-in-action is a real-time cognitive approach one uses to analyze a situation and adapt one's thoughts and actions to the requirements of the change one is trying to achieve. When engaging in reflection-in-action, one will stop during the action, make adjustments, and alter methods to improve practice as necessary. Schön (1987) further clarified this: "What distinguished reflection-in-action from other kinds of reflection is its immediate significance for action" (p. 29). This specific type of reflection was frequently witnessed over the course of this research, particularly during the practice sessions. As resident participants were trying to complete their anastomosis, discussion would often take place between residents over the correct steps of the procedure, especially during the first few weeks of the anastomosis lab. Suture placement was discussed extensively during the early weeks, especially with regards to weighing the effects of a wrong throw of the suture against redoing the suture throw altogether, which might correct suture placement but could impact the leak test by making an additional hole in the graft.
Reflection-on-action, however, is a post-analysis of our action once we have completed the activity (Eva & Regehr, 2005). In this research, the benchmark videos provided participants with an expert performance against which they could compare their own video, and reflect upon it, in an effort to formulate goals and objectives to help improve their performance during the next lab session. Not only has this type of practice been shown to be effective, but the literature has also shown that making mistakes and reflecting upon them in this type of safe setting is one of the best ways to learn (Agha & Fowler, 2015).
Benchmark videos alone, however, may not have been the only factor contributing to the reflective practice in our trainees, as a coach or a facilitator can also be instrumental in this process. This was the purpose of structuring this research around the POMM (Peno & Silva Mangiante, 2012) to help promote reflection, particularly in our structured, individualized, video-based coaching group. While we were unable to show a difference in self-assessment scores between the two groups, both showed significant overall improvement in self-assessment scoring, and both were exposed to the coach/facilitator to varying degrees. For the Coaching group, reflection was specifically addressed in separate, open discussion that was structured around the Re-GROW coaching framework. In the weekly lab sessions, the facilitator openly discussed options to improve technical performance with all participants, which included discussions on comparing their performance with the benchmark video.
Although it was not measured, and is difficult to control for, it can be speculated that both processes allowed the coach/facilitator to play an active role in all participants' technical skill acquisition through both direct and indirect promotion of a reflective process. As Schön himself notes, "Every attempt to produce an instruction is an experiment that tests both the coach's reflection on his own knowing-in-action and his understanding of the student's difficulty" (1983, p. 104). While we know that both instructional methods (coaching and providing benchmark videos) have individually been shown to confer a benefit in self-assessment scoring in previous research (Bonrath et al., 2015; Martin et al., 1998), it becomes difficult to differentiate how much each methodology did or did not contribute to the improvement in self-assessment scores seen in this research. Future research may be directed at measuring the overall effect on reflective practice of coaching frameworks versus benchmark videos, alone or in combination, in promoting self-assessment.

Simulation labs such as this one are costly, not only in terms of equipment and materials, but also in the time commitment required by faculty to staff these simulation labs as facilitators and evaluators. This time commitment makes the process significantly more resource intensive in an effort to benefit a select few residents (Kneebone et al., 2010). In the vascular anastomosis lab used in this research, the availability of three vascular surgeons was required throughout the seven weeks: one to serve as the facilitator and coach to the participating residents, and two to serve as the blinded video reviewers and evaluators.
The time demands to run this lab were substantial, with the facilitator spending four hours each week for a total of five weeks overseeing the lab, in addition to reviewing the Coaching group videos. The video evaluations alone were a considerable time commitment for the two blinded faculty, especially given that each of the thirteen resident videos ran anywhere from twenty to forty minutes, with a total of fifty-six videos produced over the course of seven weeks. Herein lies a secondary benefit of simulation-based training: faculty demands could ultimately be lessened, to some extent, if simulation-based activities did not have to be solely faculty driven, but could also be self-directed by the resident participants themselves (Arora et al., 2011). Self-assessment, in this respect, has the potential to serve the dual role of assessing one's own learning and performance, while also helping to offset some of the time commitments required by faculty.
This research, however, aligns with what previous research in this area suggests: simple correlational analysis indicates that there is typically very little agreement between self-assessment and expert assessment (Davis et al., 2006; Falchikov & Boud, 1989). Because of the consistency of these findings, it has become very difficult to know exactly how to implement self-assessment as an effective teaching modality in surgical training programs. Looking at the correlation coefficients obtained in this research (Table 29) comparing Attending MOSAT and GRS scores to Resident MOSAT and GRS scores over the four trial weeks, the following correlational trend was observed:

GRS Correlation Coefficients
Week 1: rs = .402    Week 3: rs = -.235    Week 5: rs = .448    Week 7: rs = -.389

According to these results, the only evaluations that reached a moderate correlation, based on a Spearman rank correlation, were the GRS evaluation in week one (rs = .402) and the MOSAT and GRS evaluations in week five (rs = .478 and rs = .448, respectively). Outside of these weeks, the overall weak correlations observed are in agreement with the majority of the literature on self-assessment. It is also worth noting that none of these correlations between Attending and Resident MOSAT and GRS scores achieved statistical significance. For this particular research, this means we cannot rule out (at the 5% level) that the observed strength of these relationships arose by chance alone, and perhaps this is the reason we had a moderate correlation for the week one GRS (rs = .402) while the week one MOSAT correlation was very weak (rs = .180). This, again, is a problem caused by the small number of participants recruited for this study, as statistical significance depends more on sample size than on the size of the correlation coefficient. Despite this, however, there are some positive attributes within this particular data set. The moderate correlations observed in week five (rs = .478, rs = .448), although not statistically significant, could be indicative of a trend toward the residents becoming better at their own self-assessment over time.
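Because statistical significance depends heavily on sample size, a coefficient of this magnitude can fail to reach significance in a cohort of thirteen. A minimal sketch (using invented scores, not the study data, and assuming `scipy` is available) illustrates how the same Spearman coefficient attains a much smaller p-value only as n grows:

```python
from scipy.stats import spearmanr

# Invented attending vs. resident scores for a cohort of 13
# (illustrative only; these are not the study's actual data).
attending = [18, 22, 25, 19, 27, 30, 24, 21, 26, 28, 23, 20, 29]
resident  = [20, 19, 27, 22, 25, 28, 21, 26, 24, 30, 22, 23, 27]

rho_small, p_small = spearmanr(attending, resident)

# The same rank pattern in a cohort three times larger yields an
# identical coefficient but a much smaller p-value.
rho_large, p_large = spearmanr(attending * 3, resident * 3)

print(f"n=13: rs = {rho_small:.3f}, p = {p_small:.3f}")
print(f"n=39: rs = {rho_large:.3f}, p = {p_large:.3f}")
```

The coefficient is unchanged because replicating the observations preserves the relative ordering of the ranks; only the p-value shrinks as n increases.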
Contrary to the research performed by Rizan, Ansell, Tilston, Warren, & Torkington (2015), providing participants with benchmark videos for this anastomosis lab did not appear to improve the overall correlations between resident self-assessment and faculty expert assessment. I do not think, however, that our data show that resident self-assessment is lacking or was not a factor in residents' overall improvement in technical skill acquisition across the seven weeks of the lab. This researcher would agree with Ward et al. (2002) that there is the potential for flawed methodology when assessing this relationship through simple correlational analysis. One of the problems the authors identify as a confounder is the assumption that experts can serve as a reliable and valid gold standard, an assumption that should not be accepted without skepticism. Here they note, "even if one assumes that clinical supervisors are, in fact, accurate (valid) judges of cognitive achievement, any conclusion with respect to the accuracy of self-assessment presumes that experts are providing a fair measure of clinical performance" (p. 68). If we look at the initial data from this research, the Kappa coefficient between the two expert evaluators was, in fact, very weak at 0.12 for the MOSAT and 0.01 for the GRS. The concordance between these scores, however, was high for the Pearson correlation (0.86, 0.80), Kendall's tau (0.70, 0.66), and Spearman's rho (0.84, 0.83). This occurred because, although one evaluator graded more strictly than the other, their scores still correlated with one another. In other words, as one evaluator's score increased, so did the other's, and vice versa, even if the actual numbers differed. Because of this concordance, it was decided that the average of the two scores be used as the "gold standard" score for each resident.
Using a third evaluator could have made this evaluation process more robust and might have improved inter-rater agreement, but this was not possible due to faculty availability.
Another factor Ward et al. (2002) identify as a methodological issue that can arise when comparing self-assessment scores to those of an expert is the assumption that all students are evaluating themselves by tapping into the same aspect of their performance.
These within group differences have rarely been evaluated in research but should be accounted for. We did not formally review each question on the MOSAT and GRS evaluation form with the resident participants prior to them filling out these evaluations.
This could have helped clarify the particular aspects of technical performance we were evaluating, specifically on the GRS, as this evaluation left more room for performance interpretation than the MOSAT.
Despite this overall weak correlation between faculty and residents, there was a consistent trend of improvement in self-evaluation scores over time, particularly by week five (MOSAT = .478, GRS = .448). This unfortunately changed in week seven, when the scores produced a negative correlation, meaning the variables were inversely related: as one score increased, its correlated score decreased.
This negative correlation could be due to the observation that residents were, overall, much less satisfied with their results on the Post-Test compared to their week five performance. Going forward with this lab, we would suggest not labeling the final week as a Post-Test, and simply referring to it as week seven. Labeling it as a Post-Test seemed to add a degree of pressure on the participants that was not present in the preceding weeks. The participants described being more nervous, as if they were actually taking a test, and this added anxiety may have caused them to feel like they were making more overall mistakes, in addition to lengthening their anastomosis times and increasing their leak rates, which was ultimately reflected in the way they evaluated themselves.
Given the significant improvement displayed by residents from Pre-Test to Post-Test, and the overall positive trend in resident self-evaluation scores that appear to track somewhat with faculty evaluation scores, findings from this research would include the recommendation that this aspect of self-evaluative performance review be maintained in future vascular anastomosis labs. A similar research project evaluating the role of self-assessment in a vascular anastomosis lab also found that self-assessment and expert assessment correlated poorly, yet the authors concluded that "self-assessment with expert feedback throughout training appears to offer an efficient method of improving the technical performance of surgical trainees as an integral part of a structured surgical training program" (Pandey et al., 2008, p. 289). The research presented here would agree with this statement, and our institution will continue to employ this combined modality in an effort to improve technical skill acquisition in junior surgical residents. A future research project would be to explore the role of peer evaluation of technical performance, which has previously been shown in several disciplines, including surgery, to be more accurate than self-assessment (Falchikov & Goldfinch, 2000; Risucci, Tortolani, & Ward, 1989). This could be another valuable strategy in helping to gauge performance improvement in junior surgical residents and a potential way to help reduce the faculty resource utilization these simulation labs require.

RQ4: To what extent do the current posttest scores on time to fashion anastomosis and leak rate compare with the prior three years of resident vascular anastomosis scores?
The goal of this fourth research question was twofold: First, because of the variables collected as part of this research (time and leak rate), it provided us with a potential way to compare resident performance from previous vascular anastomosis labs at our institution with this year's lab, which employed a new format including benchmark videos, video playback analysis, and a validated scoring system, both with and without coaching. Second, depending on the analysis, it could provide us with some insight into the outcomes our department had been measuring previously, to see if we should permanently alter the format of our annual vascular anastomosis lab moving forward.
This research question ultimately proved to be one of the more interesting findings of this research after final analysis, as will be discussed.
Our department of surgery previously published results in 2012 attempting to identify objective procedural end-product metrics for surgical residents participating in a vascular anastomosis simulation lab (Okhah et al., 2013). The two main metrics identified in that research were time to complete the anastomosis and saline leak rate. In addition to these performance measures, four technical errors, including suture technique, locking sutures, air knots, and broken sutures, were also identified as a way to track performance during the course of this vascular anastomosis lab, which consisted of three practice weeks in addition to the Pre-Test and Post-Test. Based on these published data, and in an attempt to quantify any improvement in a resident's performance during their participation in our vascular anastomosis lab, a simplified scoring system was developed and implemented in 2013. While this score was developed with the intention of ranking residents according to performance, it ultimately became difficult to use this score as a true marker of technical skill. The first problem was the inconsistency and subjectivity in deducting points from the final score for each technical error made during the Pre- and Post-Test. For example, if a resident scored a 50 based on the equation "Score = 300 - (Time x 10) - Leak Rate," and they also made three technical errors while performing their anastomosis, their score would actually be 47. These technical errors were also somewhat arbitrarily chosen for inclusion in the analysis and had not been previously validated as markers of technical skill related to a vascular anastomosis. In addition, this simplified score placed the majority of its weight on the variables of time and leak rate, which, as we will discuss, may not be the best measures of technical performance.
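For clarity, the arithmetic of the simplified score can be sketched as follows (the one-point-per-error deduction mirrors the worked example above; as noted, this deduction was applied inconsistently in practice):

```python
def simplified_score(time_min, leak_ml, errors=0):
    """Simplified score: 300 - (time in minutes x 10) - leak rate (ml),
    minus one point per technical error (per the worked example)."""
    return 300 - (time_min * 10) - leak_ml - errors

# A 20-minute anastomosis with a 50 ml leak scores 50;
# the same performance with three technical errors scores 47.
print(simplified_score(20, 50))            # 50
print(simplified_score(20, 50, errors=3))  # 47
```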
The second problem with this simplified score was the requirement that residents be limited to only thirty minutes to complete their anastomosis. As a significant percentage of residents ultimately take more than thirty minutes to perform this anastomosis (30% of residents in the 2019 class), this time constraint always prevented some residents from completing their anastomosis, or caused them to rush and make more technical errors, which would then alter their final score. The third, and perhaps most notable, problem with this simplified score was related to the data analysis over the course of the last four vascular anastomosis labs (which included the current 2019 data). Although our current anastomosis lab participants showed significant improvement in technical skill over the course of the lab, based on attending and resident MOSAT and GRS scores, if we based their score solely on the simplified score, the opposite was true. The 2019 resident participants appear to have regressed in their technical skills over the course of this lab according to the simplified score, with a median decrease of 33 points from pre-intervention (Mdn = 117) to post-intervention (Mdn = 84). The same was true when we compared the prior three years of simplified score data (2016, 2017, and 2018). We now know, based on the objective data presented in this research, which was structured around a validated technical skills checklist in addition to the variables of time and leak rate, that our residents do, in fact, improve their technical skill acquisition over the course of a seven-week vascular anastomosis simulation lab. The research presented here, however, now calls into question the content validity of the simplified score.
Content validity relates to this simplified score's ability to accurately represent a resident's technical skill acquisition, related to creating a vascular anastomosis, based primarily on the factors of time to complete the anastomosis and measured leak rate.
While time is an important factor in any surgical procedure, this specific variable has been examined previously in the literature with regard to outcome. Datta et al. (2002) attempted to objectively measure whether differences in a surgeon's manual dexterity had any impact on a simulated vascular surgical procedure. While they found that trainees with better manual dexterity produced better outcomes with regard to anastomotic leak and degree of stricture, they also found that time taken to perform the procedure did not have any influence on these outcomes. Jensen et al. (2008) made similar observations.

The second major component of this simplified score, which had been used as a marker of residents' technical skill acquisition, is the anastomotic leak rate. While various models have been described to train residents in performing a vascular anastomosis, materials used typically include both real tissues, such as porcine aortas, and artificial tissues, such as silicone, Gore-Tex, and PTFE (Jensen et al., 2008; S. Schwartz et al., 2014; Sidhu, Park, Brydges, MacRae, & Dubrowski, 2007; Wilasrusmee, Lertsithichai, et al., 2007). The most difficult aspect of working with artificial tissue, particularly the PTFE graft used in this research, is dealing with needle hole leaks. Needle hole leaks are a known problem wherever prosthetic grafts are used (Baker, 1987). To overcome this limitation, companies have made sutures, such as polypropylene, with smaller needle-to-suture diameter ratios approaching 1:1, which were the sutures used in this research. While this ratio theoretically allows the suture to fill the needle hole, research has shown that although leak rates are reduced, they are still present (Dang et al., 1990). This can be a significant problem in real-world settings, where vessel trauma and bleeding can compromise outcomes.
To combat this, there has been significant research on how best to deal with needle hole leaks in vivo, such as the use of surgical sealants like fibrin glue, thrombin, and protein-based adhesives (Rogers, Turley, Cross, & McMonagle, 2016).
For the purposes of this vascular anastomosis lab, we have to acknowledge that, despite a near-perfect anastomosis and proper technique, needle holes ultimately contributed some amount of leak upon completion. For example, when our attending vascular surgeon and lab facilitator performed his own anastomosis using the same materials as the resident participants, the amount of saline leak recorded was 28 ml. These needle hole leaks can best be appreciated in the example video, where the straight lines of saline exiting the graft signify needle hole leaks. While some researchers have tried to simulate the viscosity of blood during an anastomotic leak test such as ours by using vegetable oil (Datta et al., 2002), saline is easier to work with and more readily available in simulation labs. Saline will, however, flow more freely through these needle holes than would vegetable oil or blood. As measured with a wide-open, incomplete anastomosis, the saline leak test used in our research produced a total of 100 ml of saline over thirty seconds. For comparison, our resident participants had average leak rates of 63 ml during the week one Pre-Test, 59 ml during week three, 47 ml during week five, and 48 ml during the week seven Post-Test.
Given this information on time to complete the anastomosis and leak rates, this research has shown that a simplified score is not an accurate way to quantify a resident's technical skill acquisition, specifically as it relates to performing a vascular end-to-side anastomosis. Looking at the past years of resident data recorded in our vascular anastomosis lab, there is very little we can take away from it. Our current resident participants displayed significant improvement in their technical skills as measured by an objective, validated scoring system, yet showed no improvement when evaluated by the simplified score. It is very likely that previous resident participants, particularly those from 2016 and 2018, who showed no improvement in their simplified score from Pre-Test to Post-Test, also improved their vascular anastomosis technical skill over the course of the lab, but valid measures were not in place to accurately quantify this improvement.
Thus, a simplified score cannot be used for measuring or quantifying a resident's technical skill in a vascular anastomosis lab, as it lacks the content validity required for this assessment.
The importance of this research question lies in the fact that we now have the local evidence needed to institute the more robust comprehensive technical score (CTS) for use in our vascular anastomosis simulation lab moving forward. The CTS has been shown, in this research, to be a more reliable indicator of our ability to measure and quantify the degree of a resident's technical skill acquisition across weeks. Its implementation in future vascular anastomosis labs may help us better track resident progress and degree of competency upon completing the lab, while also helping us identify residents who may need further practice in the simulation setting prior to assisting with this anastomosis in the operative environment. Further research is needed to validate this scoring system during the next vascular anastomosis lab in 2020.

Final Analysis
As this research was unable to show a significant difference between the two modalities implemented, the first major question we have to address is what exactly contributed to the improved performance in technical skill acquisition experienced by residents in both the structured, individualized, video-based coaching group and the self-assessment, video analysis only group. Having explored each research question in detail and reviewed the descriptive and inferential statistics obtained from the lab, there is unlikely to be a single answer to this question.
We have to acknowledge the fact that there was going to be some improvement in performance simply based on resident practice alone. Even without proper instruction, residents would likely have improved their technique across the seven weeks to varying degrees. We also know that residents who participate in basic, standardized vascular simulation labs show improvement in overall performance, while those who adhere to a distributive practice schedule, such as that used in this research, can further enhance this performance (Moulton et al., 2006). These distributive practice sessions alone, however, cannot completely account for the degree of improvement observed in both attending and resident self-assessed MOSAT and GRS scores.
The benchmark videos provided to all participants, along with the playback analysis and video self-evaluation scoring, also likely played a significant role in the technical skill acquisition achieved by all junior surgical residents participating in this research. Based on the number of views both the benchmark videos and individual videos accumulated over the course of the lab, it would be hard to discount this effect. While participant motivation and reflective practice were not measured in this research, the overall participant response to the structure of this lab, incorporating these on-demand videos, was overwhelmingly positive. How much these videos further promoted individual reflective practice, which has been shown to have a significant effect on technical skill acquisition (Stefanidis, Korndorffer, Heniford, & Scott, 2007), is not known. This is especially pertinent to the self-assessment, video analysis only group, who showed similar improvement in performance to the structured, individualized, video-based coaching group.
The second major question we have to address is why this research was unable to show a statistically significant difference in technical skill acquisition between residents in a structured, individualized, video-based coaching group and residents in a self-assessment, video analysis only group. As previously discussed, coaching frameworks, both with and without video analysis, as well as video-based self-assessment, have previously been shown to improve technical skills in surgical settings to a greater extent than standardized instruction alone (Bonrath et al., 2015; Farquharson et al., 2013; Singh et al., 2015; Soucisse et al., 2017).
The one area in the literature that had yet to be explored at the time of this research project was comparing video-based coaching to self-assessment video analysis, to determine if one was superior over the other. As both modalities are unique and can each contribute to technical skill acquisition in various ways, especially with regards to reflective practice, we were likely unable to show a difference between the two modalities due to the low number of resident participants (n=13) enrolled in this research.
As the control group only missed out on the additional video-based coaching and email reinforcement, this research was underpowered to detect a potential difference between the experimental and control groups. Unless a large, multicentered trial can be coordinated, it is unlikely that a difference between video-based coaching and self-assessment video analysis will be elucidated.
Another issue that may have contributed to this research's inability to show a difference between the structured, individualized, video-based coaching group and the self-assessment, video analysis only group is resident interaction. While the researcher did their best to randomize residents and to hold the video-based coaching sessions outside of normal practice sessions, the two groups could not be isolated during practice and video sessions due to resident schedules and availability. Because of this, residents from the Coaching group may have acted as assistants to residents in the Non-Coaching group and vice versa. While residents in the Coaching group were encouraged not to discuss the specifics of the framework or coaching discussions with the control group, discussions between residents during practice sessions (or outside of the lab) could have touched on some of this methodology, or led to reflective discussions that may have eroded any potential benefits conferred by video-based coaching.

Implications
Although this research was unable to show a significant difference between structured, individualized, video-based coaching and self-assessment video analysis only, there are some meaningful outcomes from this research that can be inferred both locally and nationally. First, this research adds to the literature on the use of video analysis and coaching in surgical simulation labs as a means to improve junior residents' technical skill acquisition. This research also adds to the literature on the ability of residents to self-assess their own performance in simulation labs. Although correlations between attending and resident scores were weak to moderate at best, resident self-assessment scores improved significantly across weeks. This finding opens the door for more research in this area, particularly with regard to peer-to-peer assessment and evaluation in surgical simulation labs.
The local effects of this research are perhaps even more significant. As our fundamentals of laparoscopy (FLS), endoscopy (FES), and robotics curricula are constantly being updated and refreshed with content, the benchmark, on-demand videos used in this research will now play a significant role in future simulation labs and curriculum development within our department. Residents who participated in this research have been a driving force in this movement, as they felt the videos helped them significantly throughout the vascular anastomosis lab. As a result, benchmark instructional videos on laparoscopic cholecystectomy and laparoscopic appendectomy are currently being prepared by the author of this research. Updated vascular anastomosis videos will also be created for next year's simulation lab. The vascular attendings who were not a part of this research, but who viewed the benchmark videos, now want to produce their own videos to help educate residents on how to perform two other vascular anastomoses, specifically an end-to-end anastomosis and how to properly parachute an anastomosis. The overarching goal, stemming from this research, is to continue building a local library of on-demand videos that residents may access at their leisure, created and approved by local surgical experts, teaching the key steps and proper technique of a specific procedure prior to a simulation or real-world experience.
Due to the overwhelmingly positive response by the residents who participated in this study, a recommendation for future research would be to incorporate qualitative analysis into the study design. As mentioned previously, multiple residents from both the Coaching and Non-Coaching groups commented throughout the lab on how much they appreciated the methodologies employed and how much the videos added to their simulation experience and technical skill acquisition. Formal documentation of these resident responses would have added to the overall strength of this study.
This research has also contributed to the development of a comprehensive technical score (CTS), to be used in our vascular anastomosis lab, in an effort to quantify a resident's technical skill acquisition according to the Dreyfus model. While further research is needed to validate this score, it can now be used as a local guide to assess what stage of skill acquisition residents are at as they progress through the next vascular simulation lab. If, by the final week of the simulation lab, a resident score is below the threshold for competent practice (CTS < 60), they would be required to complete more simulation sessions until they are able to reliably achieve a baseline competency score.
While this research was unable to show a difference between residents exposed to a video-based coaching framework compared to those in a self-assessment, video analysis only group, the fundamentals of the two modalities have proven to play an important role in surgical simulation and surgical training. This research has shown that both modalities can be easily incorporated into a simulation lab format and produce meaningful results.
It will be the author's recommendation that our department provide residents with exposure to the Re-GROW coaching framework, combined with a self-assessment of technical skill, to help reinforce reflective practice in next year's vascular anastomosis lab.
Lastly, previous anastomosis labs held within our department ran for an average of five weeks. After taking the Pre-Test and Post-Test weeks into account, this allowed participants only three practice sessions with the facilitator to help improve their technique. Based on the resident performance data documented in this research, where residents' technical skills did not begin to plateau until week five, three practice weeks are unlikely to provide residents with enough distributive practice sessions to raise their performance to the level of competency designated in this research.
While there is no agreed-upon length for a simulation lab, as the number of practice sessions does not itself confer competency, this author would advocate for no fewer than four practice sessions between the Pre-Test and Post-Test, for a minimum total of six weeks in future versions of this vascular anastomosis lab.

Limitations
Despite the attempt at a rigorous research design, there are some limitations associated with this project. The first limitation is the single-centered research design employed, which limits our ability to generalize the findings of this research to other surgical residency programs. The second limitation is the small number of residents enrolled in this trial (n = 13). Because of this low number of participants, concern can be raised as to whether this study was adequately powered to detect a difference between the two independent variables. There was no way to increase this number, however, as this research was limited by the availability of second-year residents in our training program, who are required to attend this anastomosis lab. Multiple measures on each participant (four video sessions per resident) were collected in an attempt to help offset this limitation.
In addition to the small number of participants and their friendly interaction outside of this anastomosis lab, the control group may have become more highly motivated to perform at a higher level in order to be competitive with the experimental group. This phenomenon is known as compensatory rivalry and is difficult to control. Two residents who were assigned to the self-assessment video analysis only group asked during week one whether they could be reassigned to the video-based coaching group. After reassurance that they would receive all coaching-based materials and an opportunity to have their own one-on-one sessions with the coach and facilitator at the conclusion of the research, they expressed their satisfaction with this plan. To adjust for this compensatory rivalry, a recommendation for future research would be to measure motivation in both groups and statistically adjust for it using an analysis of covariance (ANCOVA). Lastly, maturation is always a threat to internal validity in experimental designs. Over the course of the seven weeks during which this research took place, participants would normally improve upon their technical skills simply due to practice. Random assignment of participants helped to offset this.

Conclusion
Surgical training has changed dramatically over the past two decades in light of concerns over patient safety, resident oversight, and resident well-being. Because of this change, residency programs have recognized that the cognitive apprenticeship model, pioneered by William Stewart Halsted in the early 20th century and based on the "see one, do one, teach one" method, may no longer be the most effective strategy to prepare surgical residents for practice (Gallagher et al., 2012). While policy implementations by the ACGME, such as the reduced hourly workweek and competency-based medical education, have been effective in addressing oversight and improving resident well-being, their cumulative effect has called into question the technical competency of residents graduating from surgical training programs. As studies have since shown a positive correlation between simulation training and technical competency in the operative setting, surgical training programs have placed a greater emphasis on using simulation bench model workshops to improve junior residents' performance on competency-based procedural skills, such as performing a vascular anastomosis.
This research attempted to evaluate whether structured, individualized, video-based coaching improved technical skill acquisition to a greater extent than self-assessment, video analysis only, in junior surgical residents attending a vascular anastomosis simulation lab. While I was unable to show a difference in technical skill acquisition between these groups, significant improvements in performance were observed over the course of seven weeks. Benchmark videos did not appear to improve residents' self-assessment of their technical skills, as correlations with expert evaluator scores remained weak to moderate throughout. This research was also able to develop a scoring system to better quantify and define technical skill acquisition according to the Dreyfus model.
While this score needs further validation, it may better help surgical training programs to define competency in resident participants prior to participation in clinical practice settings. This research adds to the overall literature on the development of psychomotor skill, video playback analysis, self-assessment, coaching, and competency-based performance as defined by the Dreyfus model of skill acquisition.