Analysis: Are We Entering the Age of the Super-Blockbuster?

July 19, 2012


Christian Bale as Bruce Wayne in Warner Bros. Pictures' and Legendary Pictures' action thriller "The Dark Knight Rises," a Warner Bros. Pictures release. TM & © DC Comics.

Steven Zeitchik recently posted an interesting article on the L.A. Times web site that discusses the possibility that we have entered the age of what he calls the "uber-hit ... movies that go beyond modest success to dominate the multiplex, often leaving other contenders far behind."

Statistical analysis can provide us with some insight into whether there really has been a sea change in the industry.

The idea is that we increasingly see either huge hits like The Avengers, The Hunger Games and (I don't think I'm sticking my neck out too far on this one) The Dark Knight Rises, or relative flops, like John Carter and Battleship, and that the hits tend to dominate at the box office for longer.

Mr. Zeitchik summarizes:

Now we're seeing a further evolution -- or, more accurately, a subdivision, a kind of box-office mitosis in which even among the already-select group of tent poles, only a handful actually work. When they do, they work big, winning the box-office crown week in and week out. The others? They can't even finish in second place on their opening weekends, as "Rock of Ages" and "That's My Boy" failed to do on Sunday.

I think anyone who follows the industry would agree that he might be on to something here, and one of the nice things about the idea of the "uber-hit" (or what I'm going to call the "super-blockbuster") is that it's something we might be able to verify using statistical analysis...

What we need to do to verify the idea is to show that the performance of big hits in comparison to other films has recently been different to their relative performance in earlier times. That prompts two questions: First, what timeframes should we consider? And, second, what methods could we use to detect a difference?

The L.A. Times article uses the last three years as a benchmark, and cites Avatar as its earliest example of a super-blockbuster. This seems reasonable to me, although there is an argument, I think, that Transformers: Revenge of the Fallen marks the beginning of the trend. Let's start out by assuming that Avatar was the first film to display this characteristic.

A couple of approaches to testing this theory occur to me. The L.A. Times article uses the number of weekends a film spends at number one as its benchmark. An alternative would be to look at whether the highest-grossing movies are over-performing in terms of total box office compared to years gone by. In this article, I'll look at the former possibility. I'll save the latter analysis for another time.

So, let's start by looking at how many weeks films have spent at the top of the chart since the release of Avatar. The chart below shows the top few entries, with Avatar leading with seven weeks, followed by The Hunger Games at four weeks, and several movies with three-week runs. The full table can be seen here, and downloaded as a CSV file here.

A total of 91 films have topped the chart since Avatar, so we have 92 films in our sample set. A reasonable comparison would be to find the 92 films that topped the chart prior to Avatar and see if the distribution is different. The chart below shows the top few entries from this list, which is headed by The Dark Knight, with four weekends at number one, and has just two more movies with three weeks at number one: National Treasure: Book of Secrets and Tropic Thunder. The full table can be seen here, and downloaded as a CSV file here.
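As a rough sketch of how per-film counts like these could be produced, the snippet below tallies weekends at number one from a list of weekly chart-toppers. The film names and the `weekly_number_one` vector are hypothetical placeholders, not the actual chart data or the query used for the tables.

```r
# Hypothetical weekly chart: one entry per weekend, naming the number-one film.
# The titles are placeholders, not real box office data.
weekly_number_one <- c("Film A", "Film A", "Film A", "Film B",
                       "Film C", "Film C", "Film D")

# Count how many weekends each film spent at number one.
weeks_at_number_one <- table(weekly_number_one)

# Sort so the longest runs come first.
sort(weeks_at_number_one, decreasing = TRUE)
```

Note that `table()` counts total weekends rather than consecutive ones, so a film that returned to number one after a gap would have both stints added together.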

A quick glance at the lists suggests that there really is a difference, and we can verify whether it is statistically significant using what's known as a t-test. Using the CSV files linked to above, and the statistical package R:
> pre_avatar <- read.csv("scoqksdis.csv")
> post_avatar <- read.csv("ompcuolhgt.csv")
> t.test(pre_avatar$r_weeks_at_number_one, post_avatar$r_weeks_at_number_one)

	Welch Two Sample t-test

data:  pre_avatar$r_weeks_at_number_one and post_avatar$r_weeks_at_number_one 
t = -2.1136, df = 151.091, p-value = 0.03619
alternative hypothesis: true difference in means is not equal to 0 
95 percent confidence interval:
 -0.44163552 -0.01488622 
sample estimates:
mean of x mean of y 
 1.239130  1.467391 

In other words, if there were really no difference in the average number of weeks spent at number one before and after the release of Avatar, the probability of seeing a gap at least this large purely by chance (the so-called p-value) would be 0.036. That is just under the 0.05 cutoff, which corresponds to the 95% confidence level that statisticians conventionally treat as significant.
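For readers curious about what `t.test` is doing under the hood, here is a minimal sketch of the Welch calculation done by hand. The `earlier` and `later` vectors are made-up values chosen for illustration; they are not the data from the CSV files.

```r
# Hypothetical weeks-at-number-one counts for two small groups of films.
# These are made-up values for illustration, not the article's data.
earlier <- c(1, 1, 1, 2, 1, 1, 3, 1, 2, 1)
later   <- c(1, 2, 1, 4, 1, 2, 1, 3, 1, 2)

# Squared standard error of each group's mean.
se2_earlier <- var(earlier) / length(earlier)
se2_later   <- var(later) / length(later)

# Welch's t statistic: difference in means over its estimated standard error.
t_stat <- (mean(earlier) - mean(later)) / sqrt(se2_earlier + se2_later)

# Welch-Satterthwaite approximation to the degrees of freedom.
df <- (se2_earlier + se2_later)^2 /
  (se2_earlier^2 / (length(earlier) - 1) + se2_later^2 / (length(later) - 1))

# Two-sided p-value: chance of a |t| at least this large if the true means are equal.
p_value <- 2 * pt(-abs(t_stat), df)

# The hand calculation should agree with R's built-in Welch test.
built_in <- t.test(earlier, later)
stopifnot(isTRUE(all.equal(unname(built_in$statistic), t_stat)))
stopifnot(isTRUE(all.equal(built_in$p.value, p_value)))
```

The fractional degrees of freedom in the output above (151.091) come from the same approximation, which is why they are not a whole number.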

Note, however, that we've put our thumbs on the scale a little by picking the release date of Avatar as the starting point for our analysis. What if we instead compare the last 100 films to top the chart with the previous 100? Using a slight modification to our original query (downloadable here and here)...

> post_avatar_100 <- read.csv("Post100.csv")
> pre_avatar_100 <- read.csv("Pre100.csv")
> t.test(pre_avatar_100$r_weeks_at_number_one,post_avatar_100$r_weeks_at_number_one)

	Welch Two Sample t-test

data:  pre_avatar_100$r_weeks_at_number_one and post_avatar_100$r_weeks_at_number_one 
t = -2.0858, df = 164.936, p-value = 0.03853
alternative hypothesis: true difference in means is not equal to 0 
95 percent confidence interval:
 -0.40878679 -0.01121321 
sample estimates:
mean of x mean of y 
     1.23      1.44 

So the p-value, the probability of seeing a difference this large by chance if nothing had really changed, rises slightly to 3.9%. However, this still meets the threshold for "statistical significance." (By the way, for stats nerds, I get similar results using the Mann-Whitney U Test, which is arguably more appropriate for this analysis.)
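A Mann-Whitney check of this kind might look something like the sketch below. The exact call used for the article is not shown, so this is an assumption about how it could be run, and the two vectors are hypothetical stand-ins for the `r_weeks_at_number_one` columns.

```r
# Hypothetical counts for illustration; in practice these would be the
# r_weeks_at_number_one columns of the two CSV files.
earlier <- c(1, 1, 1, 2, 1, 1, 3, 1, 2, 1)
later   <- c(1, 2, 1, 4, 1, 2, 1, 3, 1, 2)

# With two samples, wilcox.test() performs the Mann-Whitney U test
# (also known as the Wilcoxon rank-sum test).
# exact = FALSE requests the normal approximation, because week counts
# contain many ties and an exact p-value cannot be computed with ties.
result <- wilcox.test(earlier, later, exact = FALSE)
result$p.value
```

The appeal of the rank-based test here is that weeks at number one are small whole numbers with a long right tail, which is a poor fit for the normality assumption behind the t-test.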

So, Mr. Zeitchik does seem to have a point. Runs at number one do seem to have become longer in the past 2-3 years. The difference, however, is fairly small. Based on the analysis above, the average run at number one has grown by somewhere between 0.01 and 0.41 weeks. If we say that the average has increased by 0.20 weeks, then one in five films that would previously have spent only one week at the top of the chart are spending two weeks as the most popular movie.
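The "one in five" reading can be checked against the sample means reported by the first t-test above. This is a back-of-the-envelope sketch; it assumes each extra weekend at number one went to a different film, which the summary figures alone cannot confirm.

```r
# Sample means reported by the first t-test above (92 films in each group).
n_films   <- 92
mean_pre  <- 1.239130
mean_post <- 1.467391

# Total weekends at number one implied by each mean.
weekends_pre  <- round(mean_pre * n_films)   # 114
weekends_post <- round(mean_post * n_films)  # 135

# Extra weekends at number one in the later group, and what share of the
# 92 films would each need one extra week to account for them.
extra_weekends <- weekends_post - weekends_pre  # 21
extra_weekends / n_films                        # about 0.23
```

So in the Avatar-based sample the shift works out to 21 extra weekends spread over 92 films, a little more than one film in five.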

A word of caution. With a change this small, it's quite possible that we're only seeing statistical noise, even though it is technically a statistically significant result. Using a 95% threshold means that if we looked at 20 randomly selected metrics about the industry, we would expect about one of them to show a result like this purely by chance. So we might just have cherry-picked a piece of data that doesn't signify very much. When we do find something like this, then, it calls for closer inspection.
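The one-in-twenty point can be made concrete with a couple of lines of arithmetic. The figure of 20 metrics is the illustrative number from the paragraph above, and the calculation assumes the metrics are independent of one another, which real industry statistics may well not be.

```r
alpha   <- 0.05  # significance threshold (the 95% level)
n_tests <- 20    # illustrative number of independent metrics examined

# Expected number of "significant" results if none of the metrics truly changed.
expected_false_positives <- alpha * n_tests  # 1

# Probability that at least one metric crosses the threshold by chance alone.
p_at_least_one <- 1 - (1 - alpha)^n_tests    # about 0.64
```

In other words, under these assumptions there is roughly a 64% chance of finding at least one "significant" result among 20 metrics even when nothing has really changed.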

On that note, in another column I'll take a look at what might have caused this change. I'll also be doing some more analysis to see if the difference can be measured in other ways.

Bruce Nash
