Analyzing Bat Tracking Data in Baseball

Getting in the Weeds With Bat Tracking


Charles LeClaire-USA TODAY Sports

Like many other nerds, I have devoted a lot of time to slicing and dicing Baseball Savant’s new bat tracking data over the last few weeks. And like many other nerds, I’m not entirely sure how we’ll end up using this wealth of new information. More time, more data, and more brain power are needed to wring out whatever sweeping new truths it may hold. I’m going to write about bat tracking data in a more focused way next week.

There are a couple of things I think are really interesting: not necessarily new information, but ways that bat tracking data can give us hard numbers for things that we’ve already learned. In this article, though, I’ll be a bit more scattershot. I’d just like to take you through how I’ve processed all the information that has come out over the last few weeks.

First off, bat tracking will give us new stats that stabilize more quickly than existing ones, as that’s how granular metrics that separate underlying skills from results tend to work. In smaller samples, exit velocity turned out to be a better predictor of overall batting performance than wRC+ or wOBA. Now we have swing speed, which in smaller samples turns out to be a better predictor of exit velocity than exit velocity itself.

To wit, I pulled data from the first week of bat tracking, April 3 to April 9, and compared it to each player’s overall numbers this season. I eliminated any player with fewer than five plate appearances during the first week or fewer than 100 PA during the entire season, which left me with a sample of 295 players. It was no contest. Full-season exit velocity had a much stronger correlation to first-week swing speed (R = .60) than it did to first-week exit velocity (R = .40). First-week swing speed also predicted full-season hard-hit rate better than first-week hard-hit rate did (R = .66 for swing speed, compared to R = .46 for hard-hit rate). If, after the first week, you want to know who’s going to hit the ball hard for the rest of the season, don’t look at exit velocity. Look at swing speed.
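For anyone who wants to try this kind of comparison at home, here’s a minimal sketch of the filter-then-correlate step in Python. It is not the code behind the numbers above. The field names (`week_pa`, `season_pa`, `week_swing_speed`, and so on) and the toy rows are invented for illustration, so the correlations it prints mean nothing.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def qualified(players, min_week_pa=5, min_season_pa=100):
    """Keep players who clear both plate appearance thresholds."""
    return [
        p for p in players
        if p["week_pa"] >= min_week_pa and p["season_pa"] >= min_season_pa
    ]


# Invented toy rows (NOT real player data). The last two fail the PA filters.
players = [
    {"week_pa": 24, "season_pa": 450, "week_swing_speed": 76.2, "week_ev": 92.5, "season_ev": 92.0},
    {"week_pa": 19, "season_pa": 380, "week_swing_speed": 71.0, "week_ev": 90.1, "season_ev": 88.4},
    {"week_pa": 21, "season_pa": 510, "week_swing_speed": 73.4, "week_ev": 86.9, "season_ev": 89.9},
    {"week_pa": 12, "season_pa": 220, "week_swing_speed": 68.8, "week_ev": 88.0, "season_ev": 86.5},
    {"week_pa": 3, "season_pa": 300, "week_swing_speed": 74.0, "week_ev": 95.0, "season_ev": 90.0},
    {"week_pa": 20, "season_pa": 60, "week_swing_speed": 70.5, "week_ev": 87.0, "season_ev": 87.5},
]

sample = qualified(players)
season_ev = [p["season_ev"] for p in sample]

# Which first-week stat tracks full-season exit velocity more closely?
r_speed = pearson_r([p["week_swing_speed"] for p in sample], season_ev)
r_ev = pearson_r([p["week_ev"] for p in sample], season_ev)
print(f"players kept: {len(sample)}")
print(f"first-week swing speed vs. full-season EV: r = {r_speed:.2f}")
print(f"first-week EV vs. full-season EV:          r = {r_ev:.2f}")
```

Swapping `season_ev` for a full-season hard-hit rate or wOBA column would cover the other comparisons in this piece.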

That said, I’m not positive that this particular way of looking at bat tracking data will help anyone. We’re probably breaking things down too finely here. After all, swing speed doesn’t have that strong a correlation to overall success at the plate; it’s much lower than exit velocity’s. If we go back to our first-week stats, swing speed has a slightly lower correlation (R = .19) to full-season wOBA than exit velocity or hard-hit rate (R = .21 for both). It can tell us sooner how hard a player is capable of hitting the ball, but it’s not any quicker at telling us how well they can hit.

Second, I’ve heard smart people say that this data could prevent injuries. If fatigue, tightness, or tenderness is keeping you from swinging as hard as you normally would, a watchful analyst could spot it in the numbers and prescribe rest before you hurt yourself. While this makes a certain amount of sense, I’m skeptical for now. People have been trying to do the same thing with pitchers for years, monitoring stride length, extension, release point, velocity, spin rate, and break for indications of fatigue or compensation.

For some quick anecdotal research, I checked two prominent players with recent injuries: Ronald Acuña Jr. and Steven Kwan. Not that this means anything, especially with two lower-body injuries, but both Acuña and Kwan were actually swinging slightly harder against four-seamers in the week before they got injured than they had been earlier in the season.

So far, my biggest takeaway is an obvious one: Bat tracking is very complicated. There are so many factors that affect swing speed and length, and if you’re trying to learn anything, you need to select your variables very, very carefully to make sure you’re comparing apples to apples. If you want to analyze swing speed, you need to make sure that you’re accounting for pitch type.

Since the sweet spot of the bat generally starts out somewhere above and behind the batter’s back shoulder, it has to travel a greater distance to reach a slider low and away than a fastball over the middle. If you’re swinging at an inside pitch, you’re more likely to meet the ball out in front, which means a longer swing. So a player who chases too many breaking balls is likely to get dinged for a long swing, as is a right-handed Astro who makes a living pulling balls into the Crawford Boxes. One of those is a bad thing, and one of those is part of the reason that Jose Altuve and Alex Bregman are perennial All-Stars.

Here’s an example of the struggle to find a representative sample. While smarter people were figuring out the things I just told you, I was wondering about the strength of the relationship between swing length and the height of the batter. After all, there’s a reason we expect bigger players with longer levers to hit for more power.

If you glance at Baseball Savant’s main bat tracking leaderboard, you’ll see that Oneil Cruz has one of the longer swings in the game, which isn’t surprising since he’s one of the longer people in the game. However, if you drill down to get a more representative sample, things change.

Let’s say you look only at competitive swings on middle-middle fastballs that resulted in balls hit straightaway. We’ve cut our sample way down, but we’re doing our best to control for the type, speed, and location of the pitch, as well as the depth of contact. If we focus on these pitches, it turns out that when he’s not flailing at breaking balls, Cruz has a surprisingly short swing, below the big league average in this particular split.
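To make that kind of drill-down concrete, here’s a rough Python sketch of filtering swings to one narrow split and then comparing each batter to the average of that same split. Everything in it is an assumption for illustration: the field names (`competitive`, `pitch_type`, `zone`, `spray`, `swing_length`), the use of zone 5 for middle-middle and `"FF"` for four-seamers, and the invented rows, which are not real swings.

```python
from collections import defaultdict
from statistics import mean


def in_split(swing):
    """True for competitive swings on middle-middle four-seamers hit straightaway."""
    return (
        swing["competitive"]
        and swing["pitch_type"] == "FF"
        and swing["zone"] == 5
        and swing["spray"] == "straightaway"
    )


def mean_length_by_batter(swings, keep=in_split):
    """Average swing length (feet) per batter over the swings that pass `keep`."""
    lengths = defaultdict(list)
    for s in swings:
        if keep(s):
            lengths[s["batter"]].append(s["swing_length"])
    return {batter: mean(vals) for batter, vals in lengths.items()}


# Invented rows (NOT real data); batter labels are placeholders.
swings = [
    {"batter": "tall_hitter", "competitive": True, "pitch_type": "FF", "zone": 5, "spray": "straightaway", "swing_length": 7.0},
    {"batter": "tall_hitter", "competitive": True, "pitch_type": "SL", "zone": 14, "spray": "pull", "swing_length": 8.4},
    {"batter": "tall_hitter", "competitive": True, "pitch_type": "FF", "zone": 5, "spray": "straightaway", "swing_length": 7.2},
    {"batter": "short_hitter", "competitive": True, "pitch_type": "FF", "zone": 5, "spray": "straightaway", "swing_length": 7.4},
    {"batter": "short_hitter", "competitive": False, "pitch_type": "FF", "zone": 5, "spray": "straightaway", "swing_length": 6.0},
    {"batter": "short_hitter", "competitive": True, "pitch_type": "FF", "zone": 5, "spray": "straightaway", "swing_length": 7.6},
]

by_batter = mean_length_by_batter(swings)
# Baseline: the average over every swing in the same split, not over all swings.
split_avg = mean(s["swing_length"] for s in swings if in_split(s))

for batter, length in sorted(by_batter.items()):
    print(f"{batter}: {length:.2f} ft ({length - split_avg:+.2f} vs. split average)")
```

The important design choice is the baseline. A batter’s number in a narrow split only means something next to the average for that same split, since the overall leaderboard average mixes in every pitch type and location.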

However, this may not be the right way to look at things. Maybe Cruz’s numbers look too rosy once we’ve thrown out his many, many whiffs. Maybe we should only be looking at whiffs. After all, if we just look at whiffs, we don’t have to worry about accounting for depth of contact, because there is no contact. That’s a huge variable eliminated.

When I looked just at whiffs on middle-middle fastballs, Cruz’s swing length was no longer below average, although it was still relatively short for such a tall player. No matter how I sliced it, I tended to find that height and swing length had a correlation coefficient between .24 and .35.

Still, as with so many of my deep dives into bat tracking data, I’m not completely sure how to make all of the parts combine into a cohesive whole. In this example, it made a lot of sense to look only at whiffs, but at the same time, it seemed ludicrous to judge a player’s swing speed, which shows how much damage they can do on contact, by throwing out all the swings where they actually made contact!

I suspect that bat tracking will be used in one particular way very quickly. We’ve all read articles about teams telling their pitchers to trust a certain pitch because it’s nastier than they realize. They’ll now be able to point to a specific number. Let’s say you’re the Rays and you want Garrett Cleavinger to throw his four-seamer more often. He might be more likely to buy in if you tell him that batters are swinging three ticks softer against it than they are against his cutter and his sinker.
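A number like that could come from something as simple as grouping the swings against one pitcher by pitch type. Here’s a small Python sketch under stated assumptions: the field names `pitch_type` and `bat_speed` and the codes `FF`, `FC`, and `SI` are assumed, and the speeds are invented to mirror the hypothetical above. They are not Cleavinger’s actual numbers.

```python
from collections import defaultdict
from statistics import mean


def swing_speed_by_pitch(swings):
    """Average opposing bat speed (mph) for each pitch type."""
    speeds = defaultdict(list)
    for s in swings:
        speeds[s["pitch_type"]].append(s["bat_speed"])
    return {pitch: mean(vals) for pitch, vals in speeds.items()}


def gap_vs(averages, pitch, others):
    """Bat speed against `pitch` minus the mean of the averages for `others`.

    A negative result means hitters swing more slowly at `pitch`.
    """
    return averages[pitch] - mean(averages[o] for o in others)


# Invented swings against one hypothetical pitcher (NOT real data).
swings = [
    {"pitch_type": "FF", "bat_speed": 68.0},
    {"pitch_type": "FF", "bat_speed": 70.0},
    {"pitch_type": "FC", "bat_speed": 72.0},
    {"pitch_type": "FC", "bat_speed": 72.0},
    {"pitch_type": "SI", "bat_speed": 71.0},
    {"pitch_type": "SI", "bat_speed": 73.0},
]

averages = swing_speed_by_pitch(swings)
gap = gap_vs(averages, "FF", ["FC", "SI"])
print(f"four-seamer vs. cutter and sinker: {gap:+.1f} mph")
```

A real version would want to control for location and count before anyone handed the result to a pitcher, for the same apples-to-apples reasons discussed earlier.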

Whiffs are great, but knowing that batters can’t even get a good swing off against a pitch might be just as strong a motivator. As I said at the top, these are just my first takeaways as I sort through the data and process what…