Why association is only half the story

I want to develop a point from Jon’s earlier post. A central theme of this project is that association (a correlation found in a drug trial, for example) is only half the story about causation. As Jon mentioned, there are many reasons that an observed correlation might be non-causal (like sampling errors, confounding, and so on). Here, I want to explore a case where a non-causal correlation was taken as sufficient reason for accepting a causal claim.

Cervical cancer

Cervical cancer is caused by infection with human papillomavirus (HPV). This claim was first made in the early 1980s by Harald zur Hausen, a German virologist. You can have a look at the original paper (Dürst et al, 1983), as well as some information about the half share of the 2008 Nobel Prize in Physiology or Medicine that he won for this work. When I started studying medicine in the late 1990s, the causal link between HPV and cervical cancer was common knowledge. So when I began researching the history of cervical cancer for my PhD (which you can read online if you’re keen on that kind of thing), it was a shock to discover that HPV was not the only virus that had been associated with cervical cancer.

Between about 1970 and 1985, herpes simplex virus (HSV) was generally accepted as the cause of cervical cancer. For example, you can peruse the forty or so papers that make up the proceedings of the 1972 American Cancer Society conference ‘Herpesvirus and cervical cancer’ in Cancer Research, which demonstrate the existence of a thriving research program on HSV and cervical cancer. I’ll discuss the significance of this below, but for now I want to introduce the question that first bothered me when I started this research: why did anyone think that HSV might cause cervical cancer?

HSV and cervical cancer?

The roots of the claim that HSV might cause cervical cancer came from some observed correlations between certain sexual behaviours and the risk of developing the disease. In fact, cervical cancer has long been noted to behave more like an infectious disease than a typical cancer. Perhaps the most interesting series of observations of this kind was produced by Rigoni-Stern in 1842 (available in English translation as Stavola, 1987), which described a series of cases in Verona (1760-1839) that showed much higher rates of cervical cancer in married women than in nuns. One possible explanation for this difference was the celibacy practised by the nuns. Other studies during the nineteenth and early twentieth centuries found that other behaviours related to sex also seemed to modify the chance of developing cervical cancer. In general, the more sex an individual had had, the greater their risk of getting cervical cancer. So being married, having sex in adolescence, contracting other sexually transmitted infections (like syphilis) and having a large number of children positively correlated with the disease, while abstinence from sex negatively correlated with the disease.

By the time that mass population screening for cervical cancer was introduced in the mid-twentieth century, these sexual risk factors had been extensively researched. One great quote from the Aberdeenshire cervical cancer research project sums up the thinking typical at the time:

The cancer patient is characterised by more marital misadventures, divorce and separation, more pre-marital coitus and deliveries and more sexual partners. (Aitken-Swan and Baird, 1966: 656)

So perhaps cervical cancer was a consequence of a sexually transmitted disease. While the usual suspects (syphilis and the like) did not seem to account for it, research in different contexts suggested that herpes viruses might cause many kinds of cancer. The details of this are rather complicated (and probably something for another post), but the upshot was that (in the mid-twentieth century) herpes viruses seemed the most likely suspects as causes of cancer in humans. Happily for researchers at the time, this seemed to provide a causal explanation for the correlation between (sexually transmitted) HSV and cervical cancer (see, for example, Kessler, 1976).

Not much of a mechanism

So infection with HSV was an attractive explanation for these sexual risk factors. But was it also the cause of cervical cancer? Well, the lack of correlation between other sexually transmitted infections and cancer of the cervix suggested that the correlation with HSV wasn’t just an accident, but was instead due to a causal relationship (Rawls et al, 1973: 1482). Other evidence, like serology, the mutagenic power of HSV, the detection of fragments of HSV DNA in cervical cancer cells, and the causal role played in other tumours by herpes viruses, seemed to support this causal claim. Yet the details of the mechanism remained elusive, as we might expect. Most of the papers from the 1973 Cancer Research volume mentioned above tried, but failed, to detect specific evidence of the mechanism linking the virus with the disease. Even so, the claim that HSV caused cervical cancer persisted well into the 1980s, and led to significant resistance when other causal claims (like that involving HPV) were mooted. In short, when combined with the plausibility of possible mechanisms involving HSV, the correlation between HSV infection and cervical cancer meant that it was unthinkable that HSV did not cause cervical cancer.


It’s pretty uncontroversial to say that we should distrust brute correlations, and that we should not mistake a correlation for a causal relation. But there are other, more subtle, issues that this case raises that I think we should be similarly mindful of. The first of these is the difference between plausible and actual mechanisms. HSV was linked to cervical cancer by an extremely plausible mechanism. But no actual mechanism was found. Good mechanisms in this context are specific and local, and we should be extremely cautious about mechanisms that are merely plausible. The boundaries here are pretty vague, though, and a future research goal for me is to try to come to grips with the difference between plausible and actual mechanisms.

The second issue that I’d like to raise by way of conclusion concerns being explicit about causal evidence. The HSV case is an example where, despite a great deal of research, no specific evidence mechanistically linking HSV and cervical cancer was found. However, this lack of evidence is not readily apparent from individual papers in the literature. Health researchers have recently adopted many strategies to more effectively review evidence of correlations (like meta-analysis and systematic reviews of trials). I imagine that a similar strategy for explicitly considering evidence of mechanism would have been valuable for HSV researchers as a way of detecting a persistent absence of evidence in the face of inquiry.


Aitken-Swan, J, and Baird, D. 1966. “Cancer of the Uterine Cervix in Aberdeenshire. Aetiological Aspects.” British Journal of Cancer, 20(4): 642–59.

Dürst, M, Gissmann, L, Ikenberg, H, and zur Hausen, H. 1983. “A Papillomavirus DNA from a Cervical Carcinoma and its Prevalence in Cancer Biopsy Samples from Different Geographic Regions.” PNAS, 80(12): 3812–15.

Kessler, II. 1976. “Human Cervical Cancer as a Venereal Disease.” Cancer Research, 36: 783–91.

Rawls, WE, Adam, E, and Melnick, JL. 1973. “An Analysis of Seroepidemiological Studies of Herpesvirus Type 2 and Carcinoma of the Cervix.” Cancer Research, 33(6): 1477–82.

Stavola, BD. 1987. “Statistical Facts about Cancers on which Doctor Rigoni-Stern based his Contribution to the Surgeons’ Subgroup of the IV Congress of the Italian Scientists on 23 September 1842.” (translation). Statistics in Medicine, 6(8): 881–4.