Swimming By the Numbers: How Elite Women Swim the 200 IM


After a long delay, this is the next instalment in a series that looks at how the elite swim the 200 by looking at their race splits. This time we’re analyzing the 200 Individual Medley for women.

It’s taken me a long time to figure out how to analyze the 200 IM. When I analyze a stroke 200, such as the 200 Backstroke, I have 50 and 100 metre long course personal bests for each of the top 24 swimmers to work with, and every one of those 24 can be expected to be incredible at Backstroke. But in an IM, not every swimmer is incredible at each stroke, and to do the same analysis I would need 50 metre PBs for each stroke for each swimmer. Those PBs just aren’t available, mainly because many of these IMers just didn’t swim stroke 50s at big meets.

That aside, I’ve come up with a way to look at the numbers, which brings us to this blog.

The Data

The data set consists of 24 elite swimmers at the time of the 2012 Olympics. I used the fastest 24 times and associated splits from the Olympics and the US Olympic Trials. For each swimmer, I used the fastest time they swam during the competition, and not just their last swim.

I also wanted a way to determine if the swimmer was sprint oriented or distance oriented, or in-between. So I added some analysis by gathering each swimmer’s 50 Free LC PB, 200 Free LC PB and 400 IM LC PB. These are swims that the vast majority of the elite 24 had. I was able to collect or estimate all of the 50 Free PBs, since most of these swimmers either swam a 50, 100 or relay 100 at a big meet in 2012 or 2011. Every selected swimmer has a 200 Free PB individually or on a relay. Interestingly, I had the most problems with the 400 IM. Six of the 200 IM elite 24 had no record of a 400 IM from 2010 to 2012. For these swimmers, I assumed that they were not particularly good at that distance, and so I gave them times that would put them in with the slowest of the 400 IM times.

The Analysis

The analysis is divided into 3 sections.

1) Determine which of the 50 Free, 200 Free and 400 IM events is the best indicator of success in the 200 IM. This will indicate whether the 200 IM is more sprint oriented, distance oriented, or in between.

2) Determine any common stroke strengths among these elite 24, and determine which strokes have the largest variations in splits (implying that some of the elite 24 are not as proficient in these strokes).

3) Determine how the top 8 differ from the other 16, and determine how the winner differs from everyone else.


How do the Elite Women Swim the 200 Individual Medley?

The first step here is to determine whether the 200 IM elite 24 are more sprint-oriented, more distance-oriented, or firmly in between as hybrid swimmers. To this end, I use the 50 Free PBs (sprint), 200 Free PBs (hybrid) and 400 IM PBs (distance) to develop correlations with the final 200 IM times. The closer the correlation, the more the 200 IM favours that sprint, hybrid or distance aspect.

I’ll explain the process with the 50 Free PBs. I rank the elite 24 in order of their 50 PBs, and then group them. The 8 swimmers with the fastest 50 Free PBs form the Top 50 PB Group, the next 8 form the Middle 50 PB Group, and the slowest 8 form the Bottom 50 PB Group. I then compare the Top Group’s average 200 IM time with the average of the fastest 8 200 IM times of the elite 24. I repeat this process with the Middle Group’s average 200 IM time and the average of the 9th-16th fastest 200 IM times, and with the Bottom Group’s average 200 IM time and the average of the slowest 8 200 IM times. The smaller the differences in the average 200 IM times for each group, the closer the correlation. As an example, if the Top 50 PB Group of 8 also had the 8 fastest 200 IM times, and the same held for the Middle and Bottom Groups, then the differences in averages would be zero, and there would be a very high correlation of 50 Free PB with the 200 IM, implying that the 200 IM is largely a sprint event. As you can see below, this was not the case.
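The grouping process just described can be sketched in a few lines of Python. This is only a minimal illustration under my own assumptions: the function name `group_difference_score` is hypothetical, and I assume the three group differences are taken as absolute values and simply added to get a total. No real swimmer data is used here.

```python
from statistics import mean


def group_difference_score(pbs, im_times, group_size=8):
    """Compare PB-ranked groups with 200 IM-ranked tiers.

    pbs[i] and im_times[i] belong to the same swimmer (times in seconds).
    Returns the per-group differences and their total. A total of zero
    means the PB ranking reproduces the 200 IM tiers exactly.
    Assumption: differences are absolute values and are summed.
    """
    if len(pbs) != len(im_times):
        raise ValueError("pbs and im_times must have the same length")

    # Swimmer indices ordered by PB, fastest first.
    by_pb = sorted(range(len(pbs)), key=lambda i: pbs[i])
    # 200 IM times ordered fastest first, regardless of swimmer.
    im_sorted = sorted(im_times)

    diffs = []
    for start in range(0, len(pbs), group_size):
        # Average 200 IM time of the swimmers in this PB group.
        pb_group_avg = mean(im_times[i] for i in by_pb[start:start + group_size])
        # Average of the matching tier of 200 IM times.
        im_tier_avg = mean(im_sorted[start:start + group_size])
        diffs.append(abs(pb_group_avg - im_tier_avg))
    return diffs, sum(diffs)
```

With 24 swimmers and the default `group_size=8`, this produces three differences (Top, Middle, Bottom) and one total. Passing 50 Free PBs, 200 Free PBs or 400 IM PBs as `pbs` gives the three totals to compare, and the smallest total marks the closest match.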

This process is then repeated for the 200 Free PBs and the 400 IM PBs. The results are here.

Figure: 50 Free, 200 Free and 400 IM PB correlation with 200 IM times

Here we can see large total differences for all three of the distances (the corresponding totals for the stroke 200 analyses were all around 2.0 or lower). The largest difference is clearly for the 50 Free PB. Evidently the 200 IM is not strongly correlated with sprinting. The next highest is with the 400 IM PBs, indicating that distance is slightly more important than sprinting. And the best correlation was found with the 200 Free PB, which validates this correlation concept, as the 200 Free and 200 IM are the same distance and can be expected to have similar sprint / distance / hybrid characteristics.

Next we’ll look at the individual strokes. As we all know, the 200 IM involves all 4 strokes, and therefore attracts virtually every combination of strengths and weaknesses in those 4 strokes. Our first analysis here is to determine if there are any consistently strong strokes across the elite 24, or any strokes that include large variations in ability. To do this, I calculated the average, fastest and slowest splits for the 24 swimmers for each stroke, and then expressed the variation in splits (fastest to slowest) as a percentage of the average.
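As a rough sketch of that calculation, assuming the variation is the gap between the slowest and fastest split divided by the average split, it could look like this in Python. The function name `split_variation_pct` is hypothetical and the sample numbers in the usage note are made up for illustration.

```python
from statistics import mean


def split_variation_pct(splits):
    """Range of 50 m splits (slowest minus fastest) as a % of the average.

    splits: one split time in seconds per swimmer, all for the same stroke.
    Assumption: variation = (slowest - fastest) / average * 100.
    """
    if not splits:
        raise ValueError("splits must not be empty")
    return (max(splits) - min(splits)) / mean(splits) * 100.0
```

For example, made-up butterfly splits of 28.0, 29.0 and 30.0 seconds have a range of 2.0 seconds on an average of 29.0, so the variation is 2.0 / 29.0, or a little under 7%. Running the function once per stroke gives the four percentages to compare.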

Figure: Split variation by stroke for the elite 24

We can see that the butterfly splits are pretty tight, with only 5.9% variation from fastest to slowest. No doubt this has something to do with everyone swimming it fresh, but it may also be that having a slow butterfly puts you at too big a disadvantage right away. The other three 50s all show roughly equal variations, each roughly twice that of the butterfly split. Butterfly seems to be a common strength among the elite 24.

Next we’ll look at the correlation of each stroke with the final times in an attempt to determine which strokes are more important than others. To do this I carry out the correlation process from above, but instead of grouping by PBs, this time I group by 50 split for each stroke. This allows me to determine a correlation of how important each stroke is to the final 200 IM times.

Here are the results.

Figure: Correlation of each stroke split with 200 IM times

Here we see better correlations than with the 50, 200 and 400 PBs, with one slightly better than the rest. The 50 Free split correlates the best with the 200 IM times, being slightly ahead of the other more or less equal correlations. Apparently, being able to finish fast correlates strongly with a good 200 IM time.


So what about the Top 8 200 IMers?

Let’s focus on the top 8 finishers, who happen to also be the Olympic finalists.

If we carry out the variation % calculation for each stroke for the top 8 just as we did for the elite 24, we find a completely different picture. When looking at the elite 24, we found a relatively small butterfly range, but large ranges for the other three strokes. Here are the results when only looking at the top 8.

Figure: Split variation by stroke for the top 8

Here we see relatively consistent splits for all, with the largest being for the Breaststroke split, and smallest for Backstroke. Basically, these 8 are good at all 4 strokes, with Breaststroke being the weakest for some.

Now, let’s get down to how Ye Shiwen won the race. Here’s a graph of the top 4 swimmers and their splits.


We can see that the first three 50s are pretty similar, other than Leverenz having a slower backstroke and much faster breaststroke. In fact, at the 150 mark the top 3 were separated by only 0.31, with Leverenz in the lead, and Rice trailing by 0.73. The race came down to the last 50 of freestyle, where Ye Shiwen pulled away from the rest. A close race no matter how you look at it.



The elite field of women tends to swim the 200 IM with more distance emphasis than sprint emphasis, with 200 Free PBs being a better indicator of 200 IM time than 50 Free PBs or 400 IM PBs. In addition, the elite field is best characterized as having a relatively strong first 50 of butterfly, but much wider variations in times for the other three strokes. Lastly, being able to finish with a fast 50 freestyle split is strongly correlated with the best 200 IM times.

The fastest 8 are differentiated in that they have relatively small split ranges between them over the four strokes, indicating they are all quite good at all 4 strokes. Backstroke has the most consistency (a common strength) and breaststroke the least. Finally, Ye Shiwen won the IM by pulling away from the rest on the last 50.


