Direct Reproduction of the Iowa Gambling Task and the Replication Crisis in Psychology
The reproducibility of psychological experiments is crucial for establishing the reliability of scientific findings. Psychology faces a replication crisis because a great number of studies either cannot be successfully replicated or do not report all the information needed for a direct replication. The aim of this study was to directly replicate the Iowa Gambling Task designed by Bechara et al. (1994) and, on that basis, to discuss the replication crisis in psychological research. Participants were 217 university students, who completed a computer-based version of the Iowa Gambling Task, which required them to choose between advantageous and disadvantageous card decks. The results were similar to the original finding, so the replication was successful and supports the findings of the original experiment. This report also discusses the importance of direct replication, as well as the reasons for and possible solutions to the replication crisis.
Replication is the reproduction of an existing study, and obtaining similar outcomes from a repeated study is an essential feature of scientific research (Zwaan et al., 2017). Replication determines whether a finding is a single observation or an evidential scientific discovery (Zwaan et al., 2017). Zwaan et al. (2017) also pointed out that “a scientific discovery requires both a consistent effect and a comprehensive description of the procedure used to produce that result in the first place.” A research report should therefore include a detailed methodology so that others can follow the same procedure and replicate the experiment. Concern about replication has grown since the results of a hundred replication experiments were published in the journal Science in 2015. These replications were carried out to determine the reproducibility of some of the most well-known experiments. Statistically significant results (p < .05) were reported in 97% of the original studies. However, only 36% of the replication studies found statistically significant results (p < .05) (Open Science Collaboration, 2015).
There can be many reasons for non-replication, for example fabricated results, small sample sizes or results that are not universal (Diener, 2016). Publication barriers can be another cause of the replication crisis (Yong, 2012). Positive results in psychology are much easier to publish because they present new, exciting research, while replications with negative results do not get the chance to be seen. Furthermore, the reproducibility of experimental findings is hugely limited when the raw data from the experiments and the methodology describing the procedure are not included in the publication (Knorr-Cetina, 1981). The concept of Open Science is therefore significant. Open science has six main aspects: open access, which makes research results available; open data, which publishes the raw data; open source, which gives access to research prototypes; open methodology, which shows the detailed procedure (Kraker et al., 2011); and open peer review and open educational resources (Diener, 2016).
Moreover, replications can be divided into conceptual (constructive) and direct (operational) replications. Zwaan et al. argued strongly that direct replication is essential to the future improvement of psychological science as a whole (Lilienfeld, 2018). In a direct replication, a scientist follows the exact same procedure under the same conditions to reproduce an earlier study and determine whether the same results can be obtained (Diener, 2016). Direct replication is essential because a successful conceptual replication does not guarantee a successful direct replication (Lilienfeld, 2018).
In addition, Simons et al. (2018) suggested that the definition of direct replication is yet to be clarified: when a replication result differs from the original, any change in the procedure becomes a possible explanation for the inconsistency, so whether the replication counts as “direct” depends entirely on how those differences in procedure are interpreted. There are also other opinions on this issue, including the view that direct replication should not be elevated over conceptual replication and the checking of statistical assumptions. Directly reproducing a study with incorrect or flawed analyses and statistical assumptions could lead to greater confidence in inaccurate findings, so determining the accuracy of statistical assumptions may be more important than direct replication (Heit & Rotello, 2018). Heit and Rotello also argued that failed replications are just as important as successful ones, because scientists can learn something valuable from both.
Bechara et al. (1994) conducted the Iowa Gambling Task experiment. They concluded that patients with damage to the ventromedial sector of the prefrontal cortices develop a severe impairment in real-life decision-making. In the experiment, the target group and a control group of healthy people were required to participate in a gambling task that simulates real-life decision-making. The task was to pick cards from four decks, two of which were considered advantageous and two disadvantageous. The results showed that the healthy control group tended to choose cards from the advantageous decks more often than from the disadvantageous ones. The aim of this study was to replicate the Iowa Gambling Task. This was done by testing the hypothesis that the result of the healthy control group in the Iowa Gambling Task can be directly replicated. The importance of direct replication is also discussed further.
A total of 217 participants (191 female, 26 male) were recruited using opportunity and volunteer sampling. All the participants were undergraduate psychology students at the University of Leeds, with a mean age of 19.47 years and a standard deviation of 1.17 years. Of the participants, 208 were right-handed and 9 were left-handed. All the participants were informed of their right to withdraw at any time, as well as their right to decline to answer any question during the process. All the participants gave informed consent for the research team to access their responses and use the data collected in the research. The study received approval from the local ethics committee.
The hypothesis was tested using a repeated measures design, as the participants repeated the choice-making trial 100 times under the same conditions (see Materials and Procedure). The independent variable was the deck of cards, which had 4 nominal levels (deck A, B, C, and D). The dependent variable was the frequency of card selection from each deck, which is measured at the interval level. All the participants were asked to complete the experiment in silence (exam conditions) to prevent bias.
Materials and Procedure
A direct replication of the Iowa Gambling Task, originally designed by Bechara and colleagues in 1994, was completed by all the participants. A computer-based simulator was used instead of the real cards used in the original task (see Appendix A). Four decks of cards (A, B, C, and D) were presented to the participants, who were required to choose 100 cards in total, one at a time. Each time a card was picked, feedback about winning and/or losing money was displayed. Participants were not informed beforehand about what each card would yield. Participants started with a "loan" of 2000 dollars and were told to make a profit. Decks A and B always gave a profit of 100 dollars, while decks C and D always gave 50 dollars. However, for each card chosen, there was also a 50% chance of having to pay a penalty. For decks A and B the penalty was 250 dollars, while for decks C and D it was 50 dollars. Each participant was given a survey about their date of birth, gender and handedness, and was asked to sign a consent form (see Appendix B) before they turned in the data.
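The payoff scheme described above can be turned into a short calculation showing why decks C and D count as advantageous. The following Python sketch is illustrative only and is not the simulator used in the study. The names DECKS, expected_net_per_card and expected_net_over are hypothetical, and the sketch assumes that the penalty is independent of the gain and occurs with probability 0.5 on every card.

```python
# Payoff scheme as described in the procedure (assumed to apply to every card).
# gain: fixed win per card; penalty: amount lost when a penalty occurs;
# p_penalty: probability that a penalty occurs on any single card.
DECKS = {
    "A": {"gain": 100, "penalty": 250, "p_penalty": 0.5},
    "B": {"gain": 100, "penalty": 250, "p_penalty": 0.5},
    "C": {"gain": 50, "penalty": 50, "p_penalty": 0.5},
    "D": {"gain": 50, "penalty": 50, "p_penalty": 0.5},
}


def expected_net_per_card(deck):
    """Expected net outcome of one card: gain minus the expected penalty."""
    d = DECKS[deck]
    return d["gain"] - d["p_penalty"] * d["penalty"]


def expected_net_over(deck, n_cards=100):
    """Expected net outcome if all n_cards were drawn from a single deck."""
    return n_cards * expected_net_per_card(deck)


if __name__ == "__main__":
    for name in DECKS:
        print(name, expected_net_per_card(name), expected_net_over(name))
```

Under these assumptions, a card from deck A or B loses 25 dollars on average (100 - 0.5 x 250), whereas a card from deck C or D gains 25 dollars on average (50 - 0.5 x 50), even though decks A and B offer the larger win on each card.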
The aim of the study was to test whether the direct replication of the Iowa Gambling Task could give the same result as the original experiment. As shown in Figure 1, the mean estimated selection frequencies of decks C and D were visibly higher than those of decks A and B.
The 95% confidence limits for decks C and D did not overlap with those for decks A and B (see Figure 1). We could therefore be 95% confident that the mean selection frequencies for decks C and D would differ from those for decks A and B in the wider population. The result can therefore be considered reliable.
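To make the confidence interval comparison concrete, the following Python sketch shows one way a 95% confidence interval for a mean can be computed and two intervals checked for overlap. It is not the analysis used in this study. The function names mean_ci and intervals_overlap are hypothetical, the example scores are made up for illustration, and the normal approximation is an assumption that is reasonable for a large sample such as 217 participants but not for a handful of values, where a t-based critical value would be more appropriate.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_ci(values, confidence=0.95):
    """Return (mean, lower, upper) using a normal approximation."""
    n = len(values)
    m = mean(values)
    se = stdev(values) / sqrt(n)  # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return m, m - z * se, m + z * se


def intervals_overlap(ci_a, ci_b):
    """True if the two (mean, lower, upper) intervals share any value."""
    _, lo_a, hi_a = ci_a
    _, lo_b, hi_b = ci_b
    return lo_a <= hi_b and lo_b <= hi_a


if __name__ == "__main__":
    # Made-up per-participant counts of cards chosen out of 100,
    # NOT data from this study.
    advantageous = [62, 58, 70, 55, 66, 61]      # decks C and D combined
    disadvantageous = [38, 42, 30, 45, 34, 39]   # decks A and B combined
    ci_adv = mean_ci(advantageous)
    ci_dis = mean_ci(disadvantageous)
    print("C+D:", ci_adv)
    print("A+B:", ci_dis)
    print("Intervals overlap:", intervals_overlap(ci_adv, ci_dis))
```

Checking whether intervals overlap is only a rough visual test, which is how it is used with Figure 1.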
Figure 1. Selection frequency scores of card decks A, B, C, and D. Error bars represent the 95% CI.
The aim of this study was to test whether the direct replication of the Iowa Gambling Task could obtain the same result as the original. The results obtained from the experiment supported this hypothesis. The selection frequencies of decks C and D (advantageous decks) were significantly higher than those of decks A and B (disadvantageous decks), just like the results from the healthy control group in the original study. This suggests that healthy people are able to evaluate which decks are riskier and which are safer in the long run, and thus that they are able to make more advantageous decisions in life.
Though the replication was successful, there were limitations to the study. First, although the replication could be seen as a direct replication, there were still differences in methodology between the current study and the original: for example, the original experiment used real card decks while the replication used computer-simulated cards. The exact definition of direct replication is yet to be settled. Second, although the sample size was not too small (217 participants), the sample could be considered unrepresentative since the ratio of male to female participants was not well balanced. In addition, the age range was very limited, as all the participants were first-year undergraduate students.
This direct replication was successful, as the same results were obtained. However, as discussed before, there are still a large number of experiments and findings that cannot be replicated, for a variety of reasons. For experimenters, it is crucial to pay close attention to sample size and to whether the sample is representative enough. When the sample is being chosen, factors such as social environment, gender, the age of the participants and anything else that might affect the result should all be taken into consideration. Moreover, it is also important to give access to the raw data, the detailed methodology and the source, and to use open peer review, to allow others to replicate and test the experimental findings. For journals, it is important to allow the publication of replications, especially those with negative results. In conclusion, work is required on all of these fronts for psychological studies to improve and become more reliable.
Finally, it is also important to pay attention to conceptual replication and the checking of statistical assumptions in addition to direct replication, as each offers a different perspective on, and explanation for, whether a study can be successfully replicated. Failed replications should not be overlooked either, as valuable findings and experience can also be gained from negative results.