Fraud in Science and Parapsychology

book cover of Betrayers of the Truth: Fraud and Deceit in the Halls of Science, by William J Broad and Nicholas Wade (1983)

It is often claimed that positive experimental findings with regard to psi phenomena may be attributed to rampant fraud on the part of researchers. In this article, British parapsychologist Chris Roe considers the true incidence of fraud in the field of parapsychology and how it compares with other scientific disciplines.

Chris Roe is professor of psychology at the University of Northampton and a former president of the Society for Psychical Research.

Contents

Background
Fraud in Parapsychology
1. Walter J Levy
2. Samuel G Soal
Fraud in Other Disciplines
Fraud in Psychology
1. Diederik Stapel
Surveys
Whistleblowers
Conclusions
Literature
Endnotes

Background

Parapsychology is portrayed in mainstream academic textbooks as characterized by experimenter fraud, with the implication that many positive outcomes can be accounted for in terms of malpractice. For example, Gross quotes Colman1 as describing the history of parapsychology as ‘disfigured by numerous cases of fraud involving some of the most highly respected scientists’.2 Such damning appraisals are made not only by hard-line sceptics, but occasionally also by commentators who contribute to the psi research literature. Douglas Stokes3 has claimed that the body of evidence from parapsychological research ‘conform[s] to the pattern that would be expected if a small minority of psi researchers has engaged in fraud’.4 For James Kennedy the situation is even worse: ‘Experimenter misconduct has occurred many times in parapsychology and is a constant threat. It detracts from the scientific acceptance of the field and hinders progress by diverting resources to invalid hypotheses’.5

Fraud in Parapsychology

To substantiate his assertion, Kennedy refers to 17 cases of fraud in parapsychology,6 of which 15 are derived from JB Rhine’s paper, ‘Security Versus Deception in Parapsychology’.7 In that article Rhine does indeed refer to ‘a dozen cases to illustrate fairly typically the problem of experimenter unreliability prevalent in the 1940s and 1950s’ – following the popularization of Rhine’s monograph Extrasensory Perception, which described research methods developed at Duke University to enable replication attempts without the need for specialist equipment or extensive training. It seems the monograph’s aim was successful, since these cases sound like reports of experiments received by the Journal of Parapsychology from people who were not academics (‘seven did not have the doctorate’) and not part of the parapsychological community (‘several were persons of evident ability but were located (some of them abroad) where research in parapsychology was extremely hard to manage but not nearly so hard to fake’).

Crucially, there is no indication that any of these persons continued to conduct experiments or that any of their work was published. It seems odd, then, for Kennedy to portray them as if they were typical or representative of the parapsychological community at large.

Rhine does refer to four additional cases involving people who were ‘all better qualified for psi research than ... the “dozen”. They all knew the rules and standards that had been developed through the years’8 and these are of more concern. Unfortunately, these cases are all described in general terms, with no information included that might allow those suspected of ‘experimenter unreliability’ to be identified. Their non-adherence to generally accepted security standards is described, though it is not altogether clear that it constitutes misconduct. For instance, the first example involves a comparison of performance by participants at psi and non-psi tasks in which two experimenters were responsible for different tests. Rather than ensure that the tests were scored independently while blind to the outcomes from other tests, the experimenters actually exchanged information when participants were scoring particularly well. This could certainly result in an expectancy bias when scoring performance that could inflate any correlation in scores between the tasks, but pales in comparison with the types of fraud discussed later in this article.

Examples 2 and 3 present a stronger circumstantial case for experimenter fraud, in which effects only occurred when one experimenter had unsupervised access to raw data records. Importantly, colleagues raised suspicions about the work and none of it had been published when the researchers withdrew.

Rhine’s final example is a study that adopted the Screened Touch Matching method described by Pratt and Woodruff. It is included because sceptic psychologist CEM Hansel had criticized the method as allowing fraud by exploiting inadequate matching. But this is not a case of someone being caught committing fraud, merely an instance in which trickery might have been possible.

Taken together, these cases do not support Kennedy’s assertion that experimenter misconduct is commonplace in parapsychology.

Walter J Levy

Of more concern are two particular cases to which Kennedy9 and Stokes10 also refer, which are generally accepted as involving calculated and systematic fraud by the experimenter. The first involves Walter J Levy Jr, a medical school graduate who had joined Rhine’s Institute for Parapsychology and was prolific and highly regarded, so much so that he had been appointed director of the Institute and was expected to succeed Rhine when he shortly retired.11 Levy was particularly interested in animal psi: he had designed ingenious experiments that tested psi abilities in gerbils, rats and chick embryos, such that successful psi performance would meet the animals’ basic needs (avoidance of pain, increase in experiences of pleasure, maintenance of optimal body temperature).

A strength of Levy’s experiments was their automaticity — once set up, the apparatus could run independently, monitoring the behaviour of random event generators (REGs) that the animals needed to influence in order to produce the desired outcome, and creating a physical recording of the outcomes for analysis. Three of Levy’s research colleagues (Kennedy, Jim Davis and Jerry Levin) became suspicious, then, of the amount of time Levy seemed to spend in the vicinity of the equipment while the experiments were in progress. They secretly wired up the computer so that it would produce a second record of REG output, and to their consternation discovered that this showed a perfectly random output, while Levy’s official record showed a deviation in the predicted direction (that is, the rats were getting more stimulation of their neurological pleasure centres than would be expected by chance). Rhine was presented with the evidence and confronted Levy, who admitted having falsified the confirmatory study but insisted that his original research was sound, and the data falsification had only begun when the genuine effect could not be repeated.12 This line of research had not yet been published (because of the need, in Rhine’s view, for replication in order to confirm findings as evidential). Levy defended his other, published, work, pointing out that it had been independently replicated both at the Institute for Parapsychology and elsewhere. Nevertheless, in writing about the affair in the next issue of the Journal of Parapsychology, Rhine sagely advised ‘although his single known violation involves only one of his many experimental lines, it unavoidably casts a reasonable doubt on all of his work individually and jointly conducted during the five years he has been in parapsychology’.13

Rhine took comfort from the principle that, in the long run, independent replication would differentiate between sound and unsound findings. He immediately wrote to all those he believed were planning to refer to Levy’s work in designing and writing up their own studies to inform them that the data were suspect, ensuring that this instance of fraud was dealt with swiftly and publicly. Fifteen months later Rhine14 gave an update which identified other suspicious practices by Levy and led him to conclude rather reluctantly that despite double blind and multi-experimenter designs in some cases, no study completely eliminated the possibility of dishonesty, and so there was no option but to write it all off.

Samuel G Soal

The second generally accepted case involves Samuel G Soal, a mathematics lecturer at Queen Mary College, part of the University of London. Soal had been one of the principal exponents of forced choice ESP testing in the UK, but the consistent failure of these tests had led him to become one of Rhine’s severest critics.15 That is, until (on the advice of Whately Carington) he reanalysed his data to look for displacement effects, instances in which the participant’s call corresponded not with the target symbol but with the preceding or subsequent symbol. Two participants who showed evidence of these effects, Basil Shackleton and Gloria Stewart, were invited back for further tests, and these resulted in a steady stream of significant results.

Soal was a mathematician by profession but achieved his DSc for psychical research. Although his reputation remained intact during his lifetime, suspicions had been raised, but these were quickly put down by threats of legal action.16 In a detailed account, Betty Markwick and Donald West17 have described testimony by one of the accusers, Gretl Albert, who was one of the successful partners in experiments carried out with Shackleton. Albert confided to her friend Mollie Goldney (Soal’s co-experimenter) that after the test she had seen him altering figures on a score sheet (the records were in the form of the numbers 1 to 5 rather than as symbols). Goldney asked to see the score sheets, but could find no signs of alteration. The nature of the alterations was not obvious from simple scrutiny with a magnifying glass; it required statistical analysis. Scott and Haskell18 speculated that if Soal was converting target 1s into 4s or 5s so that they matched participant calls, then there should be an excess of hits on trials where the target was apparently a 4 or 5 and a deficit of target 1s where the guess was 4 or 5. Both effects were confirmed so dramatically as to suggest that most target 1s that fell opposite a guess 4 or 5 had been altered to produce false hits. However, there was no overall deficit of target 1s or excess of target 4s and 5s: Barrington19 interpreted this to mean that the target sheets had been prepared with manipulation in mind, by beginning with too many 1s and too few 4s or 5s. Scott and Haskell acknowledged that their findings could not explain the above chance scoring in most of Soal’s sessions with Shackleton, but felt it was more likely that other falsification methods had been used than merely mixing the genuine and the fraudulent.
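The logic of the Scott and Haskell check can be illustrated with a small simulation. The sketch below is not their analysis and uses no data from Soal’s records: the function names, the number of trials and the randomly generated targets and guesses are all assumptions made for illustration. It simulates the hypothesised tampering (a target 1 opposite a guess of 4 or 5 is overwritten with the guess) and then tabulates the two signatures described above.

```python
import random
from collections import Counter

def hits_by_target(targets, guesses):
    """Count direct hits (guess == target) separately for each target digit."""
    return Counter(t for t, g in zip(targets, guesses) if t == g)

def targets_opposite(targets, guesses, guess_digits=(4, 5)):
    """Count which target digits fall opposite guesses of 4 or 5."""
    return Counter(t for t, g in zip(targets, guesses) if g in guess_digits)

def alter_ones(targets, guesses, guess_digits=(4, 5)):
    """Simulate the hypothesised tampering: a target 1 opposite a guess of
    4 or 5 is overwritten with the guess, turning a miss into a hit."""
    return [g if (t == 1 and g in guess_digits) else t
            for t, g in zip(targets, guesses)]

# Hypothetical session: purely random targets and guesses (no psi, no fraud).
rng = random.Random(0)  # fixed seed so the sketch is repeatable
n_trials = 2000
guesses = [rng.randint(1, 5) for _ in range(n_trials)]
honest = [rng.randint(1, 5) for _ in range(n_trials)]
altered = alter_ones(honest, guesses)

for label, targets in (("honest", honest), ("altered", altered)):
    hits = hits_by_target(targets, guesses)
    opposite = targets_opposite(targets, guesses)
    print(label, "| hits on targets 4/5:", hits[4] + hits[5],
          "| target 1s opposite guess 4/5:", opposite[1])
```

In the altered record, hits pile up on targets 4 and 5 while target 1s vanish from the trials where the guess was 4 or 5. The simulation starts from uniformly random targets, so it also produces an overall shortage of 1s; it does not model Barrington’s suggestion that the sheets were prepared in advance with surplus 1s.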

Evidence of those other methods came from Markwick,20 who found that Soal’s pre-prepared lists of random numbers (supposedly drawn from published sources) contained repetitions of sequences of up to 25 digits at a time, sometimes in reverse order. These repeated sequences contained an occasional extra digit, and 75% of these gave hits, which could suggest that placeholder digits (for instance, ‘1’) had been entered with the intention of adjusting them later to match the call.
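A search of the kind Markwick’s finding implies can be sketched as follows. This is a simplified illustration rather than her procedure: the function name `find_repeats`, the window length and the example digits are assumptions, and the sketch only detects exact repeats, so it would not by itself pick out runs interrupted by an occasional extra digit.

```python
def find_repeats(digits, min_len=8):
    """Report runs of `min_len` digits that reappear later in the list,
    either in the same order or reversed.

    Returns (first_position, later_position, kind) tuples, where kind is
    "forward" or "reversed". Overlapping windows are ignored.
    """
    first_seen = {}  # window -> index of its first occurrence
    matches = []
    for i in range(len(digits) - min_len + 1):
        window = tuple(digits[i:i + min_len])
        candidates = [(window, "forward")]
        if window[::-1] != window:  # avoid double-reporting palindromes
            candidates.append((window[::-1], "reversed"))
        for key, kind in candidates:
            j = first_seen.get(key)
            if j is not None and j + min_len <= i:
                matches.append((j, i, kind))
        first_seen.setdefault(window, i)
    return matches

# Hypothetical target list: one run reused forwards, then backwards.
run = [3, 1, 4, 1, 5, 9, 2, 6]
digits = run + [7, 7] + run + [0] + run[::-1]
print(find_repeats(digits, min_len=8))
```

In a genuinely random list of digits from 1 to 5, an exact repeat of 25 consecutive digits would be extraordinarily unlikely, which is why such repeats are informative.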

Despite the general suspicions surrounding Soal’s reported levels of success (according to Stokes,21 Rhine had long suspected that his research was fraudulent), other aspects of his behaviour are puzzling. Markwick and West22 conclude, ‘our revelation of his further deceptions may boost some sceptics’ assumptions that all seemingly convincing claims for the paranormal must be fraudulent. That is not our view. Soal was exceptional in his secretiveness, his resistance to outside interference and his unwillingness to have his subjects tested by other experimenters.’

Fraud in Other Disciplines

How does this portrait of fraud in parapsychology compare with fraud in other sciences? Rhine himself felt that parapsychology lagged behind, commenting that ‘most other branches of science have already matured to the point where the problem of experimenter trickery causes no great concern. This is partly because deliberate fraud would be too quickly spotted and exposed’.23

Such confidence in the checks and balances of mainstream scientific practice has been typical. Broad and Wade24 cite testimony given by Philip Handler, president of the National Academy of Sciences, before the US House Committee on Science and Technology. Handler described the problem of scientific fraud as ‘grossly exaggerated’ by the press and said that, even when it does transpire, it ‘occurs within a system that operates in an effective democratic and self correcting mode’.

But that system has proven to be imperfect and the collection of cases of scientific misconduct has grown steadily. So much so that the National Science Foundation now differentiates between three types: fabrication, falsification, and plagiarism.25 Plagiarism is the appropriation of another person’s ideas, processes, results, or words without giving appropriate credit. Falsification is manipulating research materials, equipment, or processes, or changing data or results. Fabrication is making up data or results. To illustrate these three types, and to evidence their pervasive occurrence, I shall briefly refer to major figures in the history of science who are now generally accepted as having engaged in fraud – the list is far from exhaustive.26

Claudius Ptolemy

Perhaps the earliest established case of plagiarism is that of Claudius Ptolemy, whose Almagest presents astronomical observations that provide the basis for a mathematical model to describe the movements of celestial bodies around the Earth, and was hugely influential until superseded by the heliocentric model proposed by Copernicus and others. Ptolemy claimed to have made these observations himself in Alexandria, Egypt. However, later back-calculations from the planets’ current positions suggested that many of these observations were very poor even by the standard of the day and accorded much more closely with what Ptolemy’s predecessor Hipparchus could have observed from Rhodes some 278 years earlier.27 Ptolemy’s observations include curious omissions: of the 1,025 stars he documented, none are from the five degree band visible from Alexandria but invisible from Rhodes. As Grant28 summarizes, ‘rather than go out and make observations, it seems Ptolemy spent his time in the Library at Alexandria cribbing many of Hipparchus’s results and claiming them as his own’.

Isaac Newton

Isaac Newton may be the most eminent person to be accused of falsification. He was an irascible personality who regularly became involved in spats with contemporaries. His greatest adversary was probably Leibniz, whose natural philosophy was at odds with Newton’s theory of gravitation and laws of motion as outlined in his Philosophiae Naturalis Principia Mathematica. Leibniz’s influence in continental Europe meant that Newton’s theory met with a lukewarm response there. In subsequent editions, Newton’s case was strengthened by the adjustment of data, including his calculations of the velocity of sound, the precession of the equinoxes and the measurement of tides, so that they agreed precisely with his theory.29 Newton’s reported measurements are given to six significant digits, a level of precision that is almost impossible even today.30

Gregor Mendel

The Augustinian friar, Gregor Mendel, is credited with making observations of the inheritance of characteristics across generations of pea plants in proportions that suggested a kind of discrete transmission, so laying the foundation for a science of genetics. However, the proportions that Mendel reported fit so exactly with theoretical expectation that they drew the suspicion of RA Fisher, the eminent statistician who was responsible inter alia for the analysis of variance (ANOVA) test. Fisher closely examined Mendel’s methods and data and found the data were too good to be true, rather kindly concluding that Mendel’s assistants may have adjusted figures in line with expectation. Others have suggested subconscious errors or selection31 or, less kindly, fudging.32
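The ‘too good to be true’ argument can be made concrete with a chi-square goodness-of-fit test read from its lower tail: instead of asking how often chance would produce a fit this bad, one asks how often it would produce a fit this close. The sketch below is a minimal illustration of that idea for a single 3:1 ratio, not a reconstruction of Fisher’s analysis, which pooled many experiments. The function names and the counts are hypothetical.

```python
import math

def chi_square_3_to_1(dominant, recessive):
    """Chi-square statistic for observed counts against a 3:1 expectation."""
    n = dominant + recessive
    expected = (0.75 * n, 0.25 * n)
    observed = (dominant, recessive)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def lower_tail_probability(chi2):
    """P(X <= chi2) for a chi-square variable with 1 degree of freedom.

    With 1 degree of freedom X is the square of a standard normal, so the
    cumulative probability reduces to erf(sqrt(chi2 / 2)).
    """
    return math.erf(math.sqrt(chi2 / 2))

# Hypothetical counts (not Mendel's data): 1,000 plants, expectation 750:250.
for dominant, recessive in ((751, 249), (735, 265), (700, 300)):
    chi2 = chi_square_3_to_1(dominant, recessive)
    p_close = lower_tail_probability(chi2)
    print(f"{dominant}:{recessive}  chi2={chi2:.4f}  "
          f"P(fit at least this close)={p_close:.4f}")
```

A single very close fit proves nothing, since close fits do occur by chance. The suspicion arises when fits this close recur across a whole series of experiments, because the joint probability of that happening becomes very small.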

Galileo

Galileo Galilei is now thought to have fabricated the results of many of his experiments, which is somewhat ironic for the exemplary empiricist who privileged experimental observations over aesthetic or theoretical concerns. Tales of his testing the action of gravity by dropping objects from the Leaning Tower of Pisa are regarded as apocryphal.33 Other experiments that he claimed to have repeated ‘near a hundred times’ with consistent results could not have given that degree of homogeneity using the materials available at the time, as was found by contemporaries such as Père Mersenne, who failed to replicate his findings.34

Fraud in Psychology

Diederik Stapel

One high-profile and extensive case of fraud within psychology involves Diederik Stapel, professor of social psychology and dean of the School of Social and Behavioural Sciences at Tilburg University. Stapel had enjoyed a prolific career: he received €2.2 million in grants from the Netherlands Organisation for Scientific Research35 and published 124 journal articles that have been cited a total of 1,756 times.36 One such article, published in the flagship journal Science, reported an experiment conducted at the train station in Utrecht showing that a rubbish-filled environment tended to bring out racist tendencies in individuals.37 The cleaning staff had been on strike just before the summer vacation: this provided an opportunity to compare responses from people who visited the station during the strike, when the platform was unclean, with those who were there once the cleaners had returned. He had speculated that expressed attitudes would be more stereotypical (‘Brazilians are sexy, British people are polite, New Yorkers are pushy’) in the former, less orderly environment.

Cleverly, he also designed a behavioural measure by inviting the predominantly white respondents to sit down on one of six seats laid out in a row while they completed the measure, which happened to have a black person seated at one end, apparently already participating in the study. Stapel reported that participants did indeed express more stereotypical views in the untidy condition, and interestingly sat farther away from the already seated black person when their environment was more messy, indirectly suggesting heightened racism.

The problem with these elegant findings is that Stapel was so convinced that this phenomenon was a true property of the real world that he didn’t think it necessary to actually collect the data, which he instead invented at home so that they would give a cleaner result. He claimed that this was a consequence of his early experience with journal editors who found his real experimental data too complicated, with relationships that were too messy, often asking him to leave out elements and make things simpler before they would publish — it was simpler to ensure the data were neat and consistent by making them up.38

Soon he was embroiled in an elaborate charade in which he collaborated with research assistants or PhD students on the development of research materials, such as questionnaires and bespoke equipment, but then (astonishingly) insisted on conducting the studies alone or giving the materials to his ‘contacts’ in schools and colleges to administer. This allowed him to simulate the experiment at home to give a reasonable benchmark score and then create datasets around that figure that would give an unambiguous but believable confirmation of his hypothesis. He brought the data sets or results of analyses back to colleagues and collaborated with them on the write-up.

Stapel may have got away with his fabrications for so long because he ensured that his findings were in keeping with general expectations. ‘I didn’t do strange stuff, I never said let’s do an experiment to show that the earth is flat … I always checked … that the experiment was reasonable, that it followed from the research that had come before, that it was just this extra step that everybody was waiting for’.39

Ultimately his fraud was revealed when his collaborators asked about possible internal analyses (such as sex differences) that hadn’t occurred to Stapel, and so hadn’t been concocted; or they asked for the raw data to conduct exploratory analyses but were told that they had been destroyed for lack of storage space. Suspicions were also raised by Stapel’s near faultless record of significance. One colleague was struck by how great the data looked, no matter what the experiment: ‘I don’t know that I ever saw that a study failed, which is highly unusual ... Even the best people, in my experience, have studies that fail constantly. Usually, half don’t work’.40 An investigation by the universities that had employed him found Stapel had committed fraud in at least 55 of his papers, as well as in 10 PhD dissertations written by his students.41 By 2014, 58 of his published papers had been retracted.42

While Stapel may be labelled ‘perhaps the biggest con man in academic science’43 he is far from unique among psychologists. Well-documented cases include that of Sir Cyril Burt, who invented data sets (and co-authors) to support claims about the role of genetic inheritance on personality and intelligence;44 Marc Hauser, who fabricated data and pressured graduate students to reach his preferred conclusions;45 Dirk Smeesters, who saw two high-profile papers retracted when their data were found to be too good to be true;46 and Karen Ruggiero, who admitted to fabricating five experiments published in two articles and to doctoring research that appeared in a third.47

Such detailed exposés can create the impression of being highly unusual aberrations perpetrated by individuals who can safely be regarded as pathological, perhaps made so by stress or overwork. But taking this view could simply be a defensive strategy, one that allows us to distance the miscreant from normal researchers and distinguishes their behaviour from normal practice. According to Broad and Wade,48 the genesis of fraud often involves smaller steps: ‘those who falsify scientific data probably start and succeed with the much lesser crime of improving upon existing results. Minor and seemingly trivial instances of data manipulation – such as making results appear just a little crisper or more definitive than they really are, or selecting just the “best” data for publication and ignoring those that don’t fit the case – are probably far from unusual in science. But it is only a difference in degree between “cooking” the data and inventing a whole experiment out of thin air’.

This resonates with Stapel’s own account. In his autobiography, Faking Science: A True Story of Academic Fraud, he begins, ‘I was doing fine, but then I became impatient, overambitious, reckless. I wanted to go faster and better and higher and smarter, all the time. I thought it would help if I just took this one tiny little shortcut, but then I found myself more and more often in completely the wrong lane, and in the end I wasn’t even on the road at all. I left the road where I should have gone straight on, and made my own, spectacular, destructive, fatal accident.’49

Surveys

Recent findings from psychology suggest that these more minor transgressions are relatively common. John, Loewenstein and Prelec50 surveyed over 2,000 psychologists about their involvement in dubious research practices. Worryingly, they found that some occurred quite regularly. These include not just those that are questionable – such as carrying out multiple analyses with selective reporting (78%) and ‘optional stopping’, terminating an experiment once a significant result has been reached (36%) – but also practices that fall closer to the conscious fraud end of the continuum, such as excluding data once its effects on the analysis are known (62%) and actually falsifying data (9%).

Extending beyond psychology, Fanelli51 presented a meta-analysis of survey data on scientific misconduct that gave a pooled weighted mean of 1.97% of respondents who admitted to having ever fabricated or falsified research data. This figure rose to 14.12% when respondents were asked if they had personal knowledge of a colleague who altered, fabricated or falsified research data. It increased to 46.24% when misconduct was defined more comprehensively, for example as ‘experimental deficiencies, reporting deficiencies, misrepresentation of data, falsification of data’.52 When the American Association for the Advancement of Science surveyed a random sample of its members, it found that 27% believed they had encountered or witnessed fabricated, falsified or plagiarized research over the previous 10 years, with an average of 2.5 examples.53 Incidences are typically even higher when restricted to the biomedical sciences.54

One indicator of fraud is the rate at which papers are subsequently retracted from journals. Steen, Casadevall and Fang55 found 2,047 retracted articles indexed in PubMed, and that the number had risen sharply in the last decade, both by reason of fraud and of ‘error’ (which included plagiarism). This couldn’t be attributed just to the increase in numbers of publications, but also represented a pro rata increase. (However, Gross56 warns against over-interpreting these figures, since increased retraction rates might also reflect the greater attention being paid to malpractice or greater willingness to bring instances to the attention of editors and publishers.)

Fang, Steen and Casadevall57 followed up by consulting secondary sources to identify the reasons for retraction if none had been mentioned in the retraction notice (as was often the case). They found that 21.3% of retractions were attributable to error while 67.4% were attributable to misconduct, including fraud or suspected fraud (43.4%), duplicate publication (14.2%), and plagiarism (9.8%).

It is not uncommon for retractions to be announced ambiguously and with little fanfare, so that they can go unnoticed. Retracted papers continue to be cited even after their retractions,58 including by 18% of the authors of retracted papers themselves, with fewer than 5% mentioning that the papers were retracted.59

Whistleblowers

Recently, internet sites such as Retraction Watch have been established with the goal of publicizing unsound research. Their influence is difficult to gauge, but they may make it easier for dubious practices to be highlighted. Most detection of scientific misconduct is by laboratory colleagues of the transgressor, including their supervisors and students, who work at sufficiently close quarters to notice oddities in behaviour or data.60 But the consequences for such ‘whistleblowers’ can be severe. Lubalin, Ardini and Matheson61 found that 47 of 68 complainants whom they surveyed suffered at least one negative consequence, such as being pressured to withdraw their allegation, being ostracized by colleagues, or suffering a reduction in research support; 56% believed that whistleblowing stigmatizes the complainant. Retraction sites provide an opportunity for concerns to be raised anonymously, thus protecting the complainant, and can ensure that allegations about particular researchers are more widely discussed and acted upon rather than swept under the carpet.

To illustrate, the blog site Sciencefraud.org published 274 anonymous emails in the period July to December 2012 from bio-scientists claiming research misconduct in studies that had been published. In January 2013, legal threats forced the closure of this site, but a further 233 anonymous emails were submitted that could not be publicized. The fate of the papers referred to in these emails was analysed by Brookes;62 he found no initial differences in the characteristics of the ‘public’ and ‘private’ cases, but the number of retractions was 650% higher for ‘public’ cases, and the rate of publishing corrections or other errata was 770% higher. Overall, some kind of corrective action was taken on 23% of the publicly discussed papers, but this was the case for only 3.1% of the unpublicized papers.

Conclusions

In conclusion, although parapsychology is marred by two well substantiated cases of fraud, these seem typical for the sector rather than a sign of something distinct about the subject area. The Levy case is reminiscent of frauds like that of the Cornell cancer researcher Mark Spector, who was being groomed to take over as lab director, while Soal’s case seems similar to that of Cyril Burt. Nor, given recent revelations about misconduct across a range of disciplines, does the incidence of fraud in parapsychology seem high compared with the sector generally. It is arguably unwarranted, then, for commentators to imply that parapsychology has a particular problem with experimenter misconduct.

Indeed, there are grounds for thinking that parapsychology is less susceptible to fraud than other research areas. Although there are as yet no systematic empirical studies of the characteristics of perpetrators of scientific misconduct, Gross has described the model fraudulent scientist as 'a bright and ambitious young man working in an elite institution in a rapidly moving and highly competitive branch of modern biology or medicine, where results have important theoretical, clinical, or financial implications. He has been mentored and supported by a senior and respected establishment figure who is often the co-author of many of his papers but may have not been closely involved in the research'.63

And, based on their extensive primary research, Broad and Wade64 conclude that ‘the crime rate in science is influenced by three principal factors: the rewards, the perceived chances of getting caught, and the personal ethical standards of the scientist’. (To this one might add the expected consequences of getting caught, since there are instances in which those who have admitted misconduct have been allowed by an institution to leave discreetly, so as to not tarnish their reputation or embroil them in lengthy and costly legal and administrative proceedings.) In other words, fraud will be more common where it is likely to be lucrative, where one’s research can pass relatively unscrutinized (or even unnoticed), and where if discovered one’s actions are more likely to be dealt with quietly rather than publicly. Arguably, none of these conditions pertains to parapsychology, which is highly under-resourced, is subject to very high levels of scrutiny whenever findings are positive, and has a track record of public exposure when fraud is discovered.

These are not grounds for complacency, and parapsychologists need to remain proactive and vigilant. Fraud is difficult to detect without access to raw data. Kennedy65 notes that those perpetrating fraud are often reluctant to share raw data for reanalysis: this certainly was the case with Soal, and may be more so now that the stochastic properties of ‘real’ data are better understood (for example Benford’s Law, which describes the frequency distribution of leading digits in many naturally occurring data sets).66 The case seems compelling for the establishment of a data repository, so that the evidence on which research claims are made can be scrutinized by anyone who has an interest, and this seems a natural extension to the design registries and repositories for unpublished research that are already available or are being developed.
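Benford’s Law gives the expected proportion of values whose leading digit is d as log10(1 + 1/d), so that about 30% of values begin with 1 and fewer than 5% begin with 9. A screening check against it might look like the sketch below. The function names are assumptions, the example data (powers of 2) are simply a sequence known to follow the law closely, and a mismatch would be a prompt for closer inspection rather than proof of fabrication, since many legitimate data sets do not follow the law at all.

```python
import math
from collections import Counter

def benford_expected():
    """Expected proportion of each leading digit 1-9 under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def leading_digit(x):
    """First significant digit of a non-zero number."""
    # Scientific notation puts the first significant digit first,
    # e.g. 0.00123 -> '1.230000000000000e-03'.
    return int(f"{abs(x):.15e}"[0])

def leading_digit_frequencies(values):
    """Observed proportion of each leading digit 1-9, ignoring zeros."""
    digits = [leading_digit(v) for v in values if v != 0]
    counts = Counter(digits)
    return {d: counts[d] / len(digits) for d in range(1, 10)}

# Illustrative data only: the powers of 2 follow Benford's Law closely.
values = [2 ** k for k in range(1, 200)]
expected = benford_expected()
observed = leading_digit_frequencies(values)
for d in range(1, 10):
    print(f"digit {d}: expected {expected[d]:.3f}  observed {observed[d]:.3f}")
```

In practice the comparison between observed and expected proportions would be followed by a formal goodness-of-fit test, and it is only meaningful for data spanning several orders of magnitude.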

(This is an edited version of an article that was first published in 2016 in Mindfield 8/1, 8-17, as ‘Oh, what a tangled web we weave, when first we practise to deceive: The problem of fraud in parapsychology’.)

Literature

Anonymous (1972). Peas on Earth. Horticultural Science, 72, 438.

Beloff, J. (1993). Parapsychology: A Concise History. London: Athlone Press.

Benford, F. (1938). The law of anomalous numbers. Proceedings of the American Philosophical Society 78/4, 551-72.

Bhattacharjee, Y. (2013). The Mind of a Con Man. New York Times, 26 April.

Broad, W., & Wade, N. (1982). Betrayers of the Truth: Fraud and Deceit in the Halls of Science. London: Century Publishing.

Brookes, P.S. (2014). Internet publicity of data problems in the bioscience literature correlates with enhanced corrective action. PeerJ 2, e313.

Budd, J.M. (2013). The Stapel Case: An object lesson in research integrity and its lapses. Synesis: A Journal of Science, Technology, Ethics, and Policy, G47-G53.

Enserink, M. (2012a). Rotterdam marketing psychologist resigns after university investigates his data. 25 June.

Enserink, M. (2012b). Diederik Stapel Under Investigation by Dutch Prosecutors. 2 October.

Fanelli, D. (2009). How many scientists fabricate and falsify research? A systematic review and meta-analysis of survey data. PLoS ONE 4/5.

Fang, F.C., Steen, R.G., & Casadevall, A. (2012). Misconduct accounts for the majority of retracted scientific publications. PNAS 109, 17028-33.

Goodstein, D. (2000). In defence of Robert Andrews Millikan. Engineering & Science 4, 30-38.

Grant, J. (2007). Corrupted Science: Fraud, Ideology and Politics in Science. Wisley, Surrey, UK: AAPPL Artists’ and Photographers’ Press Ltd.

Gross, C. (2016). Scientific misconduct. Annual Review of Psychology 67, 693-711.

Judson, H.F. (2004). The Great Betrayal: Fraud in Science. London: Harcourt, Inc.

Kennedy, J.E. (1975). Summary of rat implementation work. [Web pdf]

Kennedy, J.E. (2014). Experimenter misconduct in parapsychology: Analysis manipulation and fraud.

Kohn, A. (1986). False Prophets: Fraud and Error in Science and Medicine. Oxford: Basil Blackwell.

John, L.K., Loewenstein, G., & Prelec, D. (2012). Measuring the prevalence of questionable research practices with incentives for truth telling. Psychological Science 23/5, 524-32.

Lubalin, J.S., Ardini, M.E., & Matheson, J.L. (1995). Consequences of whistleblowing for the whistleblower in misconduct in science cases. Washington, DC, USA: Research Triangle Institute.

Markwick, B. (1978). The Soal-Goldney experiments with Basil Shackleton: New evidence of data manipulation. Proceedings of the Society for Psychical Research 56, 250-77.

Markwick, B., & West, D.J. (2018). Dr Soal: A psychic enigma. Proceedings of the Society for Psychical Research 60.

Miller, S.J. (ed.) (2016). Benford’s Law: Theory and Applications. Princeton, New Jersey, USA: Princeton University Press.

Newton, R.R. (1977). The Crime of Claudius Ptolemy. Baltimore, Maryland, USA: Johns Hopkins University Press.

Price, M. (2010). Sins against science. Monitor on Psychology. July/August 2010.

Ranstam, J., et al. (2000). Fraud in medical research: An international survey of biostatisticians. Controlled Clinical Trials 21, 415-27.

Rhine, J.B. (1974a). Security versus deception in parapsychology. Journal of Parapsychology 38, 99-121.

Rhine, J.B. (1974b). A new case of experimenter unreliability. Journal of Parapsychology 38, 215-25.

Rhine, J.B. (1975). A second report on a case of experimenter fraud. Journal of Parapsychology 39, 306-25.

Roberts, D.L., & St John, F.A. (2014). Estimating the prevalence of researcher misconduct: A study of UK academics within biological sciences.

Shamoo, A.E., & Resnik, D.B. (2003). Responsible Conduct of Research. New York: Oxford University Press.

Stapel, D. (2012). Faking Science: A true story of academic fraud. Translation 2014 by Nicholas J. L. Brown. [Web pdf]

Stapel, D.A., & Lindenberg, S. (2011). Coping with chaos: How disordered contexts promote stereotyping and discrimination. Science 332/6026, 251-53.

Steen, R.G., Casadevall, A., & Fang, F.C. (2013). Why has the number of scientific retractions increased? PLoS ONE 8, e68397.

Stokes, D. (2015). The case against psi. In Parapsychology: A Handbook For the 21st Century, ed. by E. Cardena, J. Palmer, & D. Marcusson-Clavertz, 42-48. Jefferson, North Carolina, USA: McFarland.

Titus, S.L., Wells, J.A., & Rhoades, L.J. (2008). Repairing research integrity. Nature 453, 980-82.

Tucker, W.H. (1997). Re-reconsidering Burt: Beyond a reasonable doubt. Journal of the History of the Behavioral Sciences 33/2, 145-62.

Wade, N. (2010). Harvard researcher may have fabricated data. New York Times, 27 August.

Wells, J.A. (2008). Final report: Observing and reporting suspected misconduct in biomedical research. Washington, DC: Gallup Org.

Westfall, R.S. (1973). Newton and the fudge factor. Science 179/4075, 751-58.

Endnotes

1. Colman (1987).
2. Gross (85).
3. Stokes (2015).
4. Stokes (2015), 42.
5. Kennedy (2014), 9.
6. Interestingly he does not mention the four cases of critics described in the same article (102-104) whose behaviour could be regarded as instances of data mismanagement.
7. Rhine (1974a).
8. Rhine (1974a), 105.
9. Kennedy (2014).
10. Stokes (2015).
11. Stokes (2015).
12. Kennedy (1975).
13. Rhine (1974b), 220.
14. Rhine (1975).
15. Beloff (1993).
16. Beloff (1993), 147.
17. Markwick & West (2018).
18. Haskell (1974).
19. personal communication, cited in Markwick & West (2018).
20. Markwick (1978).
21. Stokes (2015).
22. Markwick & West (2018), 176.
23. Rhine (1974a), 12.
24. Broad & Wade (1982), 11.
25. cf. Gross (2016), 694.
26. For further examples, see Broad & Wade (1982); Kohn (1986); Grant (2007); and Judson (2004).
27. Newton (1977).
28. Grant (2007), 20.
29. Broad & Wade (1982).
30. Westfall (1973).
31. cf. Judson (2004), 52-58.
32. Anonymous (1972).
33. Grant (2007), 20.
34. Broad & Wade (1982), 26.
35. Enserink (2012b).
36. Budd (2013).
37. Stapel & Lindenberg (2011).
38. Bhattacharjee (2013).
39. Bhattacharjee (2013).
40. Bhattacharjee (2013).
41. Levelt (2012).
42. This still only places him fourth on the Retraction Watch ‘leaderboard’, which is headed by Japanese anaesthesiologist Yoshitaka Fujii with 183 retractions.
43. Bhattacharjee (2013).
44. Tucker (1997).
45. Wade (2010).
46. Enserink (2012a).
47. Price (2010).
48. Broad & Wade (1982), 20.
49. Stapel (2012), iii.
50. John, Loewenstein & Prelec (2012).
51. Fanelli (2009).
52. Fanelli (2009), 7.
53. Titus, Wells & Rhoades (2008).
54. Ranstam et al (2000); Roberts & St John (2014); Wells (2008).
55. Steen, Casadevall, & Fang (2013).
56. Gross (2016).
57. Fang, Steen, & Casadevall (2012).
58. cf. Budd (2013).
59. Gross (2016).
60. Shamoo & Resnik (2003).
61. Lubalin, Ardini, & Matheson (1995).
62. Brookes (2014).
63. Gross (2016).
64. Broad & Wade (1982), 86.
65. Kennedy (2014).
66. See Benford (1938), Miller (2016).