The Effects of Sleep Deprivation, Caffeine, and Alcohol on Simulated Neurosurgical Performance

Training safe and competent neurosurgeons is increasingly challenging in the changing health care environment, for three primary reasons. First, the rapid pace of research has increased the breadth of knowledge that must be mastered. Second, regulations on surgical resident training, namely the 80-hour per week resident work hour limit instituted by the Accreditation Council for Graduate Medical Education (ACGME), have limited the training time available to achieve that mastery. Because residents must now spend less time in the hospital while completing their education in the same number of years, training programs are forced to adopt more efficient methods of educating them. Third, demands for quality and safety and expectations from patients and families are higher, leading to a need for increased supervision of trainees until they have mastered the necessary techniques.

ethical, and legal issues related to animal models, and poor ability of cadavers to simulate hemorrhage during surgery. Importantly, they also provide a way to sensitively quantify specific aspects of technical surgery, including individual hand force generated on tissues, time to complete a task, blood loss, volume of tumor missed, volume of normal tissue injured, and procedure-specific surgical errors. They can mimic some aspects of neurosurgery that cannot be simulated in cadaver and human models, such as bleeding, operating through traditional corridors, and the tactile properties of tissues that are changed by the tissue preservation and fixation process. The NeuroTouch (National Research Council of Canada) is a VR simulator that uses stereovision and bimanual tools with force feedback. Surgical tools available on this VR simulator include suction, cavitron ultrasonic surgical aspirator (CUSA), bipolar, endoscopy, and micro-scissors.
We hypothesized that sleep deprivation, caffeine intake, and alcohol consumption the night prior to simulated neurosurgery may impact performance on the NeuroTouch. We tested this hypothesis in neurosurgical trainees at our institution.

Study design
The study design, methods, and procedures were approved by the institutional review board at the University of Minnesota. Trainees and faculty from the University of Minnesota Department of Neurosurgery practiced a simulated bimanual arachnoid dissection microsurgical task on the NeuroTouch simulator until it was mastered. This task involves using bipolar forceps in the left hand and micro-scissors in the right hand. Participants were instructed to continue practicing the task until they received an overall score of 80 out of a possible 100 (indicating mastery of the task). Participants then attended a gathering where blood alcohol content (BAC) was measured 20 minutes following consumption of the last alcoholic beverage for the night (AlcoHAWK Slim Digital Breathalyzer Alcohol Detector, Quest Products, Pleasant Prairie, WI). The simulator module was repeated the following morning between 8 and 10 am, to mimic typical operating room starting times.
Participants were also encouraged to repeat the task as many times as possible following sleep deprivation and caffeine intake. This was done on various days at the participant's convenience over the course of three months. A questionnaire (Supplementary Figure) documenting sleep, caffeine, and alcohol intake was completed with each use. In the questionnaire one unit of alcohol was assigned the equivalency of one shot of 80 proof liquor, one bottle of beer, or one glass of wine.
Supplementary Figure: Questionnaire documenting sleep, caffeine, and alcohol intake, completed with each use of the simulator. One unit of alcohol was defined as one shot of 80 proof liquor, one bottle of beer, or one glass of wine.

Performance metrics
Performance metrics included task duration, left and right hand excessive force, number of correct and incorrect fibers cut, and overall score. The overall score was determined as follows: a participant who cut all sixteen white fibers received 100%; cutting up to 9 of the red fibers subtracted up to 100% from the score; and use of excessive force subtracted up to 50% per hand.
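
The scoring rule above can be illustrated with a short sketch. The text gives only the maximum credit and the maximum penalties, so the linear scaling used here, the 0.0 to 1.0 force severity inputs, the floor at zero, and the function name `overall_score` are all assumptions made for illustration; this is not the simulator's actual scoring algorithm.

```python
def overall_score(correct_cut, incorrect_cut, left_force=0.0, right_force=0.0):
    """Hypothetical reconstruction of the overall score, in percent.

    correct_cut   -- number of white (correct) fibers cut, 0..16
    incorrect_cut -- number of red (incorrect) fibers cut, 0..9
    left_force    -- assumed severity of left hand excessive force, 0.0..1.0
    right_force   -- assumed severity of right hand excessive force, 0.0..1.0
    """
    total_correct = 16         # stated in the text: sixteen white fibers
    total_incorrect = 9        # stated in the text: up to 9 red fibers
    max_force_penalty = 50.0   # stated in the text: up to 50% per hand

    if not 0 <= correct_cut <= total_correct:
        raise ValueError("correct_cut must be between 0 and 16")
    if not 0 <= incorrect_cut <= total_incorrect:
        raise ValueError("incorrect_cut must be between 0 and 9")
    for severity in (left_force, right_force):
        if not 0.0 <= severity <= 1.0:
            raise ValueError("force severity must be between 0.0 and 1.0")

    # Assumption: credit and penalties scale linearly with the counts.
    credit = 100.0 * correct_cut / total_correct
    fiber_penalty = 100.0 * incorrect_cut / total_incorrect
    force_penalty = max_force_penalty * (left_force + right_force)

    # Assumption: the score cannot drop below zero.
    return max(0.0, credit - fiber_penalty - force_penalty)
```

Under these assumptions, a participant who cuts all sixteen white fibers with no red fibers and no excessive force scores 100, and the same run with maximal excessive force in one hand scores 50.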

Statistical analysis
Student's t-test was performed using Prism software (GraphPad software, Inc.) with a P value of less than 0.05 considered significant.
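
For readers who want to reproduce this kind of comparison outside Prism, the following is a minimal sketch of an unpaired, two-sided Student's t-test with pooled variance, written with the Python standard library only. The function names and the sample values in the usage note are illustrative assumptions and are not study data; the P value is obtained by numerically integrating the t density, which is a convenience choice here rather than the method Prism uses.

```python
import math
from statistics import mean, variance


def students_t(a, b):
    """Return (t statistic, degrees of freedom) for two independent samples,
    assuming equal variances (classic Student's t-test)."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # Pooled sample variance across both groups.
    pooled = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    se = math.sqrt(pooled * (1 / n1 + 1 / n2))
    return (mean(a) - mean(b)) / se, df


def t_pdf(x, df):
    """Probability density of Student's t distribution."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided P value via Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1 - 2 * area)
```

As a usage example with made-up numbers, `t, df = students_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])` followed by `two_sided_p(t, df)` gives a P value well above the 0.05 threshold, so that difference would not be called significant.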

Participants
Eight surgeons at various levels of training volunteered to participate (Table 1). The mean age of participants was 33.5 years and the average post-graduate year was 5.75. All participants had various levels of prior experience with the simulator as well as with actual microneurosurgical operative technique in patients. All participants were right-handed.

Figure 1:
Task Duration: There were no differences in task duration for any group. In the alcohol experiment, group A (4 subjects) had an average blood alcohol level of 0.14 the night before completing the module while group B (4 subjects) ingested no alcohol. In the sleep experiment, group A (3 subjects) had >6 hours of sleep and group B (4 subjects) had <6 hours of sleep. In the caffeine experiment, group A (5 subjects) ingested caffeine within 8 hours of simulator task completion while group B (2 subjects) had none.

Sleep deprivation
The effects of sleep deprivation were mixed. There was no statistical difference in task duration (Figure 1) or left hand excessive force (Figure 2) between the sleep-deprived and non-sleep-deprived groups. The group that slept six or more hours had significantly higher use of excessive force with the right hand than the group that slept less than six hours (3.08 ± 0.62s versus 1.19 ± 0.17s, respectively; P=0.0012) (Figure 3). Those who slept less than six hours cut more correct fibers than those who slept six or more hours (14.86 ± 0.18 versus 12.45 ± 0.54, respectively; P<0.001) (Figure 4). The group that slept less than six hours also cut more incorrect fibers than those who slept six or more hours (1.59 ± 0.21 versus 0.98 ± 0.16, respectively; P=0.03) (Figure 5). Those who slept less than six hours completed the task with more overall errors (Figure 6).

Figure 2:
Left Hand > Threshold Force: There were no statistically significant differences in excessive force used by the left hand in any of the groups. In the alcohol experiment, group A (4 subjects) ingested alcohol the night before completing the module while group B (4 subjects) ingested no alcohol. In the sleep experiment, group A (3 subjects) had >6 hours of sleep and group B (4 subjects) had <6 hours of sleep. In the caffeine experiment, group A (5 subjects) ingested caffeine within 8 hours of simulator task completion while group B (2 subjects) had none.

Caffeine
Caffeine consumption was defined prior to analysis as any caffeine consumed within 12 hours of simulator use. The average caffeine consumed by participants was 2.6 cups of coffee and the average time prior to simulator use was 4.1 hours. Participants were asked to report all caffeinated beverages consumed; however, coffee was the only source of caffeine reported. Participants who reported consuming more than one alcoholic beverage were excluded from this section of the study. The average amount of sleep was 4.9 hours in the group that consumed caffeine and 5 hours in the group that did not.
There was no statistical difference in task duration (Figure 1) or left hand excessive force (Figure 2) between the caffeinated and non-caffeinated groups. However, the use of excessive force with the right hand was higher in the group that did not consume caffeine than in the group that did (3.04 ± 0.65s versus 1.27 ± 0.16s, respectively; P=0.003) (Figure 3). The caffeinated group also cut significantly more fibers, both correct and incorrect (Figures 4 & 5). Again, the caffeinated group completed the task with more errors (Figure 6).

Figure 3:
Right hand > Threshold Force: There was no statistical difference in the use of excessive force for the alcohol groups. Group A (4 subjects) ingested alcohol the night before completing the module while group B (4 subjects) ingested no alcohol. In the sleep deprivation experiment the group that had >6 hours of sleep (Group A, 3 subjects) used more excessive force than those who slept <6 hours (Group B, 4 subjects). For the caffeine experiment the group who did not ingest any caffeine (Group B, 2 subjects) had significantly higher use of excessive force than those who ingested caffeine (Group A, 5 subjects).

Figure 4:
Number of Correct Fibers Cut: There was no difference in the alcohol groups. Group A (4 subjects) ingested alcohol the night before completing the module while group B (4 subjects) ingested no alcohol. The sleep-deprived group (Group B, 4 subjects) cut significantly more correct fibers than the group that slept six or more hours (Group A, 3 subjects). The caffeinated group (Group A, 5 subjects) cut significantly more correct fibers than the non-caffeinated group (Group B, 2 subjects).

Alcohol
Four of the eight participants in the study consumed alcohol, and the others served as controls. All participants who consumed alcohol and agreed to have their blood alcohol concentration tested via breathalyzer were included. Caffeine intake and hours of sleep prior to the study were self-reported. The alcohol-consuming group slept an average of 4.25 hours versus 4.75 hours in those who did not consume alcohol. The average caffeine consumed in the 12 hours prior to simulator usage was 0.25 cups of coffee versus 2 cups of coffee, respectively.
In the participants who consumed alcohol, the average BAC was 0.14 and 0.005 the night prior and the morning of the task, respectively. There was no statistical difference in any measures in the alcohol versus non-alcohol groups (Figures 1-6).

Figure 5:
Number of Incorrect Fibers Cut: There was no significant difference in the number of incorrect fibers cut in the alcohol experiment. Group A (4 subjects) ingested alcohol the night before completing the module while group B (4 subjects) ingested no alcohol. The sleep-deprived group (Group B, 4 subjects) cut a significantly higher number of incorrect fibers than those who slept 6 or more hours (Group A, 3 subjects). The caffeinated group (Group A, 5 subjects) cut significantly more incorrect fibers than the non-caffeinated group (Group B, 2 subjects).

Figure 6:
Overall Performance: There was no significant difference in overall performance in the alcohol experiment. Group A (4 subjects) ingested alcohol the night before completing the module while group B (4 subjects) ingested no alcohol. The sleep-deprived group (Group B, 4 subjects) had significantly better overall performance than those who slept 6 or more hours (Group A, 3 subjects). The caffeinated group (Group A, 5 subjects) also had significantly better overall performance than the non-caffeinated group (Group B, 2 subjects) as determined by the simulator.

Discussion
In educating future neurosurgeons, our goal is to provide them with the knowledge and technical skills needed to deliver safe and high quality care to their future patients. The provision of safe and high quality neurosurgical care is, in some ways, analogous to the performance of an athlete or airline pilot. In the world of sports, athletes approach training and competition with dietary, drug, and sleep restrictions. Commercial pilots perform a task that can impact the lives of others and accordingly have strict guidelines on their work hours and alcohol intake. Surgeons practice a technical skill daily that can impact others' lives but have no specific guidelines on sleep requirements or on alcohol, medication, or caffeine intake, all of which may potentially affect performance.
We hypothesized that sleep deprivation, caffeine, and alcohol consumption the night prior to simulated microneurosurgery may impact performance on a virtual reality neurosurgical simulator. We tested this hypothesis in a group of trainees and faculty from our neurosurgical program using a microneurosurgery arachnoid dissection module on the NeuroTouch VR simulator. This small study suggests there may be effects of caffeine and sleep deprivation on our surgical performance. With our small number of participants, we were not able to detect a "hang-over" effect.

Sleep
Concerns that sleep deprivation and fatigue impact physician performance have transformed residency training with the implementation of duty hour regulations. Various studies have examined sleep deprivation and its effects, with mixed results [2]. A study reviewing anonymous surveys of internal medicine physicians reported that 41% of mistakes made were attributed to fatigue. In this study over ninety percent of the mistakes were considered significant adverse outcomes by the house officers [3]. A matched, retrospective cohort study of surgical and obstetrical procedures found higher complication rates in patients whose physicians had less than six hours of sleep [4]. A systematic review comparing sleep-deprived and non-sleep-deprived surgeons showed higher morbidity and mortality in non-cardiothoracic surgical procedures with sleep deprivation [5].
There are, however, also several studies showing no adverse effects of sleep deprivation. Using a laparoscopic simulator, Uchal et al. [6] found no significant difference in surgical performance between surgeons who had 1.5 hours of sleep in the past 24 hours and those who had an average of 6.5 hours of sleep. Ellman et al. [7] conducted a retrospective study of cardiac surgical procedures and found no difference in complication rates between procedures conducted by sleep-deprived and non-sleep-deprived surgeons. A retrospective, matched-cohort study with 38,978 patients found no significant difference in death, readmission, or complication for elective procedures whether or not the attending physician performing surgery had provided medical services the previous night [8]. A recent study by our group showed that after the implementation of duty hour regulations for residents there has been no change in neurosurgical morbidity or mortality [9].
The definition of sleep deprivation varies widely between studies [5,7,10]. Some define deprivation as the physician reporting having worked the night before the surgical procedure, and some define it by hours slept. We chose the six-hour cutoff after reviewing the study by Chu et al. [11] of 4047 consecutive cardiac procedures, which defined sleep deprivation as less than six hours of sleep.
In our study, participants who slept less than six hours completed the task with more overall errors compared to those who slept six or more hours. However, analysis of other metrics showed mixed results. There was no difference in task duration or left hand excessive force between the sleep-deprived and non-sleep-deprived groups. The group that slept six or more hours had significantly higher use of right hand excessive force than the sleep-deprived group, suggesting that sleep deprivation may actually decrease use of excessive force, an action thought to be potentially harmful to neural tissue.
One weakness of our study is that there is no differentiation between residents and attending surgeons. Gerdes et al. [10] showed that on a laparoscopic simulator, attending physicians make 25% fewer cognitive errors than residents when sleep deprived. Our results support the view that sleep-deprived surgeons may make more errors.

Caffeine
Little is known about the effects of caffeine on surgical performance. In studies with athletes, 5-6 mg/kg of caffeine before training or competition improved motor skill and cognitive performance [12]. Lower doses near 3 mg/kg have been efficacious for endurance capacity and cognition in military populations [13]. Microsurgeons often discuss how they believe caffeine affects their tremor, but there is no literature within neurosurgery regarding this commonly held belief. Our study does not support this perception. In line with this, a study of seventeen ophthalmologic surgeons after ingestion of 200 mg of caffeine or placebo found no statistical difference in the effect on tremor as evaluated by observers [14].
In our study, caffeinated participants received higher overall scores on the simulator because they completed the task using less excessive force and cut more correct fibers. However, they did so with more mistakes. The participants who consumed caffeine prior to the simulated task had more overall errors than those who did not consume caffeine. The non-caffeinated group displayed more right hand excessive force and cut significantly fewer correct and incorrect fibers.
Although the simulator gives higher overall scores for those who drink caffeine, it has no way to measure fine tremor, which is an outcome measure that may be affected. Daily caffeine drinkers may perform worse without their normal dose of caffeine and this was also not controlled for in our study.
An area that also has been poorly studied is the effect of sleep deprivation and caffeine consumption combined. Our study also did not look at this interaction. A study with medical students showed that sleep deprivation decreased performance metrics on a laparoscopic simulator but then the metrics improved to baseline when caffeine was consumed [15].
alcohol consumption the evening prior to neurosurgery on surgical performance. The "hangover effect" is thought to decrease memory, psychomotor vigilance, and fine motor dexterity [16,17]. This knowledge led to the Federal Aviation Administration "bottle-to-throttle" rule, under which pilots cannot fly a civilian aircraft within 8 hours after consuming any alcoholic beverage [18].
A study of the hangover effect on 27 surgeons using a laparoscopic simulator found that in one of three tasks the surgeons were less accurate the morning after drinking [19]. A study looking at simulator performance with laparoscopic novices as well as surgeons found that the group which drank alcohol the night before had impaired performance worse than baseline, worst in the morning but still present even at 4:00pm [20]. Kocher et al. evaluated performance the morning after and found worsened performance metrics in the group that had consumed alcohol the evening prior. There also have been studies showing there is no effect. Dorafshar et al. [21] reported that surgical performance was impaired immediately after moderate alcohol consumption but this impairment was not observed the morning after.
In our study, there was no statistical difference in any of the performance metrics between participants who consumed alcohol the evening prior to simulated microneurosurgery and those who did not, despite a relatively high BAC in the group that consumed alcohol (0.14). It is quite possible that alcohol consumption does affect neurosurgical operative skills but that the simulator is not sensitive enough to pick up these small differences. Additionally, only four subjects consumed alcohol in this portion of the study. Moreover, there was no cognitive aspect to our simulated neurosurgical task. Simulator studies that show poor pilot performance from the hangover effect not only use flight simulators but also incorporate an event requiring pilots to use unusual emergency procedures [22]. Our simulator cannot test the response to unusual situations, and this may be an area where there are differences in performance between groups.

Limitations and Future Directions
Our study suggests there may be effects of caffeine and sleep on our surgical performance, while we were unable to detect a "hangover" influence. We acknowledge our study has several limitations. We had a relatively low number of participants. This is a common problem in single-institution neurosurgery training programs due to the relatively small size of our specialty. We considered including medical students and other types of residents, but wanted to avoid the bias that could arise from including non-neurosurgical specialties. Another approach would be a multi-center study, but hurdles exist here too because few training programs use the same type of VR simulator. Also, our task was relatively short (it can be completed in minutes), so we were unable to study the surgical fatigue that occurs with long operations, which could very well be affected by sleep deprivation, caffeine intake, and alcohol consumption the evening prior to surgery. We chose a relatively short task because current time demands on neurosurgical trainees make it difficult for them to spend several additional hours performing simulated surgery; however, we may consider this option in future studies.
Our simulated task also lacks a cognitive component that requires memory and problem solving, which can also be affected by sleep deprivation, caffeine, and a hangover. A longer procedure may also better bring out the effects of these potential influences on surgical performance. Lastly, we acknowledge that simulated microneurosurgery differs substantially from real-life neurosurgery and the simulator may not be able to predict operative performance. Additionally, the overall score generated by the simulator may not accurately reflect the most important performance metrics. For example, in the simulator the cutting of an incorrect fiber is given a negative point value that is subtracted from the overall score, whereas in a neurosurgical procedure the cutting of one incorrect fiber or artery may negate all the positive work done in an entire procedure. In another example, our results showed that caffeinated participants achieved higher overall scores on the simulator but with more mistakes. The overall score calculated by the simulator may therefore not be as important as the individual metrics measured. The excessive force measurement is also based on the simulator's assessment of force and not on what may be clinically relevant excessive force in a surgical procedure.

Conclusion
This small study suggests there may be effects of caffeine and sleep on our surgical performance. If our tendency to make errors is truly higher with lack of sleep or more caffeine then further work needs to be done in this area so we can be aware of and control for these variables to provide the safest care possible. With the improvements in simulation this can be studied further using much larger groups and various controls.