Newsletter #90–Feb./Mar. 2006

Surrogate Markers in Breast Cancer Research:
A Clinician’s Report on the 28th Annual San Antonio Breast Cancer Symposium

by M. Ellen Mahoney, MD

Winter holiday preparations came to an abrupt halt again this year as the time arrived for the San Antonio Breast Cancer Symposium (SABCS), which takes place in December each year. It is an intense conference, with meeting hours from 7 a.m. to 10 p.m. each day. It is possible to connect briefly with many old friends, but there is little time for idle chatter unless you agree to skip a session together. I go primarily to get some perspective on the topics that will be controversial in the coming year.

Recently the agenda has been replete with “interim analyses,” tempting clinicians to change practice before the complete story is in on a potential new treatment. Some colleagues ponder the balance between applying new approaches too early and being so cautious that patients are deprived of promising new strategies. Many, however, seem to view the words spoken at the symposium as established truth, and rush home to try the newest combination as soon as possible to establish themselves as the local authority on what is hot in breast cancer. For women with breast cancer, the result of progress is often more treatment, but little critical analysis is given to diminishing returns or the expense of new therapies. Often by the time the analyses are complete, the effects seen at the first interim analysis—that had looked so promising—are much less impressive. Meanwhile, back home, practice has already changed.

Just a month before SABCS, an important review article in the November 2, 2005 issue of the Journal of the American Medical Association (JAMA)1 looked systematically at the impact of stopping clinical trials early “for benefit.” Coming as it did between the October 2005 release of the new studies expanding the indications for the use of Herceptin and SABCS, I could not help but wonder if the timing was deliberate. “Stopping early” has become a common event in recent breast cancer trials, especially big and important ones, and the impact has been huge.

The ATAC (Arimidex, Tamoxifen Alone or in Combination) trial, for instance, which I mentioned in my last dispatch to BCA,2 has been responsible for the hugely increased use of aromatase inhibitors, despite the fact that the initial report was published after a median follow-up of 18 months, and when fewer than 1 percent of the women randomized in the trial had finished the course of treatment to which they were assigned. Conscientious clinicians have a persistent anxiety about the unknown long-term side effects of these now widely used medications.

The JAMA review discusses the fact that trials stopped early are more likely to be industry-funded. The percentage of early-stopped trials increased from 0.5 percent in the years 1990 to 1994 to 1.2 percent in the years 2000 to 2004. On average, they had recruited 63 percent of their planned sample and stopped within 3 to 25 months (median 13) after one interim analysis, when a median of 66 (23 to 195) patients had had an “event” defined as an endpoint of the study. Ninety-four percent (135 out of 143) omitted important details, such as the planned sample size and information on how the decision to stop early was made.

The trials with the fewer events showed the bigger treatment effects, sometimes to the point of implausibility, according to the authors of the JAMA article. It was noted that investigators, patients, and funders may have different but “convergent” reasons for stopping early. The practice of stopping trials early is expected to become even more common as we become more used to seeing it. In particular, it is tempting to underestimate risk and to overestimate benefit when information is incomplete. Reforms were suggested, and readers were urged to view results of any of these trials with skepticism.

In spite of such calls for skepticism, believe me when I say that the effect is often the opposite. I sit on tumor boards in which experienced clinicians justify treatment recommendations by referring to trials that were stopped early for “ethical reasons” (meaning that the benefit was so huge that it was immoral not to change practice immediately). A similar tone of moral obligation is raised when these studies are mentioned in big meetings like SABCS, as participants reminisce about these “landmark” studies.

The Herceptin studies reported in an October 2005 New England Journal of Medicine3 were well designed, and the safeguards on data management were generally in place. But when I read them, I was uneasy—not only because I felt skeptical but also because I wondered what was wrong with me that I could not embrace the results with the enthusiasm of my colleagues. When the JAMA article was published in November, I felt better about my concern, but I still could not put my finger on the basis for my continued suspicion and caution. Maybe it was other statistical problems with clinical trial design such as “intention to treat” (when a patient is kept as a member of the group to which they were randomized even if they drop out). Or maybe it was something else. Then in December I went to San Antonio and it became clear.

I have written before about the appearance of impropriety that taints huge scientific meetings like SABCS and the American Society of Clinical Oncology’s annual conference. As always, the limousines for major academic figures, paid for by industry, were lined up at the beginning and the end of the meeting. There were lavish industry-sponsored parties at night. The Herceptin study authors held court at noontime and appeared on the JumboTrons to great acclaim during the meeting.

But the biggest change was seen in the freebies at industry exhibits. Forget pens, pads of paper, and Hershey kisses. Remember when a tote bag was the big catch? At Genentech’s booth (the company that manufactures Herceptin), the atmosphere was nothing short of giddy. For little or no effort, the thousands of symposium participants could score free USB flash drives, travel alarm clocks, and a keychain device that one can use to test the strength of the wireless signal in the vicinity. Herceptin is very expensive, and the indications for its use had just expanded markedly.

When I was in medical school, we had not allowed drug companies to give us stethoscopes for ethical reasons. We’ve come a long way, but toward what? (And I confess I did get a memory stick and clock as gifts for my son.) Then it hit me.

We get so used to hearing the “events” that are used as endpoints in studies. We talk about them without thinking about whether they are meaningful in a given context. I started discussing this with startled colleagues, all of whom expressed some wonder that they hadn’t really thought about them for a long time either.

Dr. Susan Love’s Breast Book describes three endpoints (“the outcome that is being compared in two cancer treatments”) concisely. Overall survival refers to the time frame from “the beginning of the subject’s entry into the study to her death from any cause.” Disease-free survival (DFS) has become the most common surrogate endpoint. DFS “measures the time from randomization to the first evidence of recurrence or death… This figure measures the number of women at a particular time who have no recurrence of breast cancer in the breast, chest wall (after mastectomy), or elsewhere in the body.” An additional endpoint is distant disease-free survival (distant DFS), which I believe is the only valid surrogate endpoint for cancer drug trials. Distant DFS “indicates the time until the first recurrence outside the breast” and measures “how many women are alive without metastases at a particular point in time.”4

In DFS, “events” that terminate a patient’s participation in the trial and drive the trial closer to completion roll right off the tongue. The list of events can include local, regional, and distant recurrence; contralateral breast cancer, including DCIS; other second primary cancers; and death from any cause before a recurrence or second primary cancer. If a patient experiences one of these, they are terminated from the study (counted as one event). When a predetermined number of events (this number is often arbitrary) is reached in the trial, an interim analysis is done. If the new treatment shows a benefit at this point, the study is released and practice changes. Sometimes the study continues; more often it doesn’t.

Using meaningless endpoints serves only to generate events and to end studies early, before we know much about the benefits or downsides. Interim analyses are often used to support changes in clinical practice, but I am troubled by the fact that the endpoints often have nothing to do with the logical action of the proposed treatment. An arbitrary predetermined number of events that favors ending the trial early may make the new treatment appear more beneficial than it truly is.

Let’s look at the latest Herceptin trials, for example, the talk of which dominated SABCS. There were some subtle differences between the three trials involved, and for details, you should read the papers, which can be downloaded from www.nejm.org. Bear in mind that we are talking about short follow-up periods, one to two years, and the number of “events” that triggered the stopping point was under the control of the committee that designed the study. These studies should be praised for the fact that the number was at least determined and announced in advance. The HERA (Herceptin Adjuvant) trial reported 347 “events” in 1,693 women, and in the other combined trial there were 394 “events” in 3,351 women.

If the chemotherapy used in the trial is known to have no effect on other cancers, why should the development of those cancers count for or against it? This is especially true for a specifically targeted drug like Herceptin. If a patient gets thyroid or colon cancer, for instance, that “event” by definition is not related to Herceptin. It functions to drive the trial more swiftly toward its end, saving money and accruing all the benefits for the investigators and sponsors that I mentioned in the beginning of this column. For similar reasons, if participants die in an auto accident, how can this possibly be used to justify pushing the trial to completion? On the other hand, especially in the case of Herceptin, I could accept the development of congestive heart failure as an event, but somehow events like this usually don’t make the cut.

It’s important to remember that women are given chemotherapy, including Herceptin, to reduce the risk of metastatic disease and prolong their life. It is not used to treat breast cancer in the breast itself, which is why distant DFS is the only valid endpoint for Herceptin.

Unless the event used as an endpoint has some possible relevance to the known or plausible effects of the drug being tested, its function can only be seen as an event that is designed to increase the chances of creating an “early stopping.” As the JAMA review points out, the more advantages seen in “early stopping,” the more likely we are to see more of this in the future. Since study designers are already using irrelevant end points, will this embolden them to add more? How far can it be taken?

I am a wholehearted supporter of research, but I want to see it designed carefully, conducted with honesty and relevance, and I want to know about important side effects before I change my clinical practice to incorporate it.

Using distant DFS instead of DFS will have the primary effect of increasing the time it takes to complete a trial. True, it will raise the cost, but it will also give us more time to see downside risks, which may take longer than a few months or a few years to develop, and to give us information that otherwise will only be learned on the backs of the patients forced to make decisions on incomplete information. It would strike a balance in the dissemination of possible benefits much earlier than if overall survival were used as the endpoint. It won’t solve all the problems or tell us all the contraindications, but it would be a big step in terms of increased safety.

Herceptin dominated the news at San Antonio, and the presentations there will almost certainly drive more use of the drug, despite the fact that only interim analyses were presented, overall survival was not affected, and long-term downside is not completely known. There was some concern expressed that the cardiotoxic effects may not be as mild and reversible as previously thought, especially when anthracyclines are used concurrently with Herceptin.

Additional Topics at SABCS

One interesting SABCS presentation suggested that greater precision in characterizing individual tumors may lead to better individualization in treatment. For instance, patients whose tumors have both the HER2/neu oncogene and the topo II gene benefit from the combination of an anthracycline and Herceptin, but those whose tumors don’t overexpress topo II do not benefit from the combination and perhaps should not be exposed to the greater risk of the combination. This type of research would lead to more precision in treatment, acknowledging that all tumors are not the same biochemically and giving us direction on how to exploit the differences in order to refine strategy.

Avastin, a drug designed to target the process by which tumors recruit their blood supply from surrounding normal tissue, was combined with Herceptin in one study presented. Avastin had not proven helpful by itself in previous studies, but when combined with Herceptin for women with HER2/neu positive metastatic tumors, it demonstrated improved progression-free survival, from 6.1 months to progression for women who received Herceptin only, to 11.4 months for those who received the combination. The need for a better and more equitable medical system is highlighted by the development of these sophisticated and targeted therapies, as the estimated annual cost of the combination is $130,000 and raises issues of access, which we should be working to resolve now.

The interim analyses of the use of aromatase inhibitors (AIs) and tamoxifen continue to show an increase in DFS with the aromatase inhibitors, and support the trend toward switching patients to AIs as a first-line therapy. The absolute benefit of the AIs over tamoxifen is not dramatic, but it is consistent, as is the increase in fractures and osteoporosis seen in the women taking AIs, highlighting the need for close monitoring of the bone density of women taking these drugs.

An exciting presentation in the last plenary session focused on tomosynthesis, an extension of digital mammography techniques, which should be commercially available in two to five years. Tomosynthesis allows digital images to be viewed as thin slices similar to a CT scan, obviating the need for compression views or other techniques used to clarify mammographic findings. Because the breast is a three-dimensional object being projected onto a two-dimensional film or screen, masses seen can be “summation artifacts,” structures that actually exist in different areas of the breast but stack up on each other in the image, forming a pseudomass or pseudocluster. Compression and rolled views are used now to try to separate the components of the summation artifact from each other.

With tomosynthesis, very specific thin-slice images of the breast from different angles will be available by computer manipulation of the data captured by the mammogram. This separates mammographic findings from surrounding dense tissue and other artifacts, leading to more accurate and earlier diagnosis of abnormalities. Less compression, a more comfortable procedure, less radiation, and more accuracy are attributes that can’t be beat in breast cancer detection.

Nothing new was presented this year on risk reduction and nothing so far on the big goal of primary prevention. One day a technique for prevention will make as big a splash as Herceptin did this year. Now that would really make the winter trip to Texas in the middle of winter holiday preparations worthwhile.

Ellen Mahoney is a breast surgeon and activist.

1 Victor M. Montori, et al., “Randomized Trials Stopped Early for Benefit,” JAMA 294, No. 17 (November 2, 2005): 2203-2209.

2 “Clinical Perspectives on the 27th Annual San Antonio Breast Cancer Symposium,” BCA Newsletter #85, March/April 2005.

3 Edward H. Romond, et al., “Trastuzumab Plus Adjuvant Chemotherapy for Operable HER2-Positive Breast Cancer,” NEJM 2005;353:1673-84 and Martine J. Piccart-Gebhart, et al., “Trastuzumab After Adjuvant Chemotherapy in HER2-Positive Breast Cancer,” NEJM 2005;353:1659-72. Online at www.nejm.org.

4 Susan M. Love, Dr. Susan Love’s Breast Book, 4th Edition, 2005, pp. 251-254.