### By: Stanley Hsu

### Introduction:

When one thinks of a good baseball team, the categories that come to mind are pitching statistics such as ERA (earned run average), strikeouts, and WHIP (walks plus hits per inning pitched). Important batting categories include batting average, on-base percentage, slugging percentage, on-base plus slugging, runs scored, and home runs. We would expect the best team to be near the top in most, if not all, of these categories. It will therefore be interesting to see where the 2023 World Series champions, the Texas Rangers, rank in each one. Since they are the champions, we would expect them to rank near the top in most statistical categories, since they won so many games. There are 30 teams in the MLB, so every ranking falls between 1 and 30.

### Part 1: Top Teams in Different Statistics

**Pitching:**

We will now examine the top 3 MLB teams in the main pitching categories: ERA, strikeouts, and WHIP. A lower ERA should indicate a better team, and the three teams with the lowest ERAs are the Milwaukee Brewers, the San Diego Padres, and the Seattle Mariners. The Rangers rank 18th in the MLB with a team ERA of 4.28. This is surprising because it puts them below the middle of the league, which would be 15th or 16th out of 30 teams.

ERA: Earned run average is calculated with the following formula: 9 × earned runs / innings pitched. A run is not counted as earned if the batter who later scores reached base on an error or on catcher’s interference.
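As a minimal sketch of the ERA formula above, the function below multiplies earned runs by nine and divides by innings pitched. The function name and the sample numbers are hypothetical, and the sketch assumes innings are given as a true decimal (box scores often write 6⅓ innings as "6.1", which would need converting first).

```python
def era(earned_runs: int, innings_pitched: float) -> float:
    """Earned run average: earned runs allowed per nine innings pitched."""
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    return 9 * earned_runs / innings_pitched


# Hypothetical pitcher (not real data): 60 earned runs over 180 innings.
print(round(era(60, 180), 2))  # 3.0
```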

When a batter makes contact and puts the ball in play, there is a chance that the ball is not fielded by the infield or caught by the outfield. Strikeouts are therefore very important for pitchers: they are certain outs that do not rely on the athleticism, positioning, or ability of the team’s defense. Surprisingly, the Texas Rangers rank only 24th (7th worst) in the MLB in team strikeouts.

ERA and strikeouts are credited to the pitcher, so WHIP is a useful additional statistic because it also reflects the defense of the team. WHIP is calculated as (walks + hits) / innings pitched. A pitcher with a very low WHIP is not giving up many walks or hits; in other words, it is very difficult for a batter to get on base against him. Surprisingly again, the Rangers rank 12th in the league in WHIP. The middle of the league would be 15th or 16th out of 30 teams, so the Rangers are only slightly above average.

WHIP (Walks plus Hits per Inning Pitched): Add up the walks and hits and divide by the number of innings a pitcher has pitched. It measures how many baserunners a pitcher allows on average, and indicates how efficient a pitcher is.
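The WHIP definition can be sketched the same way. The function name and the sample line are made up for illustration, and innings are again assumed to be a true decimal.

```python
def whip(walks: int, hits: int, innings_pitched: float) -> float:
    """Walks plus hits per inning pitched: average baserunners allowed per inning."""
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    return (walks + hits) / innings_pitched


# Hypothetical pitcher (not real data): 50 walks and 150 hits over 180 innings.
print(round(whip(50, 150, 180), 2))  # 1.11
```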

It is very surprising that the Rangers are not a very good pitching team: they do not rank near the top of the league in any of the three main pitching categories. Fifty baseball experts filled out a questionnaire in which each voted for what they thought was the most important factor in winning baseball games, and 44 of the 50 chose pitching. The Rangers therefore defied the conventional wisdom, since they won the World Series with only mediocre pitching.

**Batting:**

We will now examine the top 3 MLB teams in the main batting categories: batting average, on-base percentage, slugging percentage, on-base plus slugging, runs scored, and home runs. A team with a higher batting average has players who get on base by a hit more often. This should theoretically correlate with winning, because getting on base leads to runs, and more runs help a team win the game. The three teams in the MLB with the highest batting average are the Atlanta Braves, the Tampa Bay Rays, and the Texas Rangers.

Batting Average: Player’s hits divided by total at-bats. However, at-bats do not take into account walks or hit-by-pitches.
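To illustrate the point about walks, here is a small sketch with two hypothetical hitters (the names and numbers are invented). They have identical hits and at-bats but very different walk totals, and batting average cannot tell them apart.

```python
def batting_average(hits: int, at_bats: int) -> float:
    """Hits divided by at-bats; walks and hit-by-pitches are not part of either number."""
    if at_bats <= 0:
        raise ValueError("at_bats must be positive")
    return hits / at_bats


# Hypothetical hitters (not real data). The walk totals are listed only to
# show that batting average ignores them.
hitters = {
    "patient hitter": {"hits": 150, "at_bats": 500, "walks": 90},
    "aggressive hitter": {"hits": 150, "at_bats": 500, "walks": 20},
}

for name, line in hitters.items():
    avg = batting_average(line["hits"], line["at_bats"])
    print(f"{name}: {avg:.3f}")  # both print 0.300
```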

It makes sense that the Rangers are 3rd in the MLB in batting average, because we would expect the team that wins the World Series to score a lot of runs. To score runs, players need to get on base, and, setting walks aside, batting average is a good way to measure how often a team gets on base.

However, there should be a statistic that includes walks, right? Yes, there is! It is called on-base percentage, and it is a very important batting statistic, because teams need to get on base to score runs and on-base percentage tells us how often a team does so. The three teams in the MLB with the highest on-base percentage are the Atlanta Braves, the Los Angeles Dodgers, and the Texas Rangers.

On Base Percentage: How frequently a batter reaches base per plate appearance. This includes hits, walks, and hit-by-pitches.
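A rough sketch of on-base percentage follows. One detail goes beyond the description above and should be read as an added assumption: the denominator uses the official form, at-bats plus walks plus hit-by-pitches plus sacrifice flies, which is close to but not exactly the number of plate appearances (sacrifice bunts, for example, are left out). The function name and the sample line are hypothetical.

```python
def on_base_percentage(
    hits: int,
    walks: int,
    hit_by_pitch: int,
    at_bats: int,
    sacrifice_flies: int,
) -> float:
    """Times on base (H + BB + HBP) divided by AB + BB + HBP + SF."""
    opportunities = at_bats + walks + hit_by_pitch + sacrifice_flies
    if opportunities <= 0:
        raise ValueError("need at least one qualifying plate appearance")
    return (hits + walks + hit_by_pitch) / opportunities


# Hypothetical hitter (not real data): 150 H, 60 BB, 5 HBP, 500 AB, 5 SF.
print(f"{on_base_percentage(150, 60, 5, 500, 5):.3f}")  # 0.377
```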

Similarly, it makes sense that the Rangers are 3rd in the MLB in on-base percentage. As a team, the Rangers reach base in about one-third of their plate appearances, whether by a hit, a walk, or a hit-by-pitch. On-base percentage is an important measure of whether a team is getting on base, but it is not the best statistic, because it weighs a double or a triple the same as a single. In baseball, a double or a triple is worth a lot more: a double followed by a single can score a run, but a single followed by a single usually cannot. Therefore, we need a statistic that accounts for doubles, triples, and home runs, otherwise known as extra-base hits.

A good statistic that accounts for extra-base hits is slugging percentage. Slugging is important because it values doubles, triples, and home runs properly. A higher slugging percentage should correlate with winning, because it means the team is getting a lot of hits, including extra-base hits. The top 3 teams in the MLB in slugging percentage are the Atlanta Braves, the Los Angeles Dodgers, and the Texas Rangers.

Slugging Percentage: Calculated with the following formula: (1B + 2 × 2B + 3 × 3B + 4 × HR) / AB. Slugging percentage therefore weighs doubles, triples, and home runs more heavily than singles by multiplying each hit by the number of bases it covers: a double covers two bases, a triple three, and a home run four.
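The weighting in the slugging formula can be sketched as below. The function name and the sample hit totals are hypothetical. The second call shows that replacing ten singles with ten home runs raises slugging even though the hit total is unchanged.

```python
def slugging_percentage(
    singles: int, doubles: int, triples: int, home_runs: int, at_bats: int
) -> float:
    """Total bases divided by at-bats, with each hit weighted by the bases it covers."""
    if at_bats <= 0:
        raise ValueError("at_bats must be positive")
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


# Hypothetical hitter (not real data): 100 singles, 30 doubles, 5 triples,
# 15 home runs in 500 at-bats -> 235 total bases.
print(f"{slugging_percentage(100, 30, 5, 15, 500):.3f}")  # 0.470

# Same 150 hits, but ten of the singles are now home runs -> 265 total bases.
print(f"{slugging_percentage(90, 30, 5, 25, 500):.3f}")  # 0.530
```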

Again, the Rangers are 3rd in the MLB in slugging percentage, which indicates they have a very solid offense. Some MLB experts like to look at a statistic called on-base plus slugging, which adds on-base percentage and slugging percentage together. The top 3 teams in on-base plus slugging are again the Braves, Dodgers, and Rangers, since they lead both of those categories.

One of the most important offensive categories is runs scored, because to win a baseball game a team needs to score more runs than its opponent. The top 3 teams in runs scored in the MLB are the Atlanta Braves, the Los Angeles Dodgers, and the Texas Rangers.

This is what we would expect, because these teams lead the league in the other offensive categories as well. Since they get on base so often, they score a lot of runs, and since they score a lot of runs, they should win a lot of games.

One way to score a run in a single plate appearance is a home run, and home runs are important for an offense. After a single, double, or triple, the runner depends on the following batters to drive him in, but with a home run a batter creates a run by himself. Therefore, it is important for a team to be able to hit home runs. The top MLB teams in home runs are the Atlanta Braves, the Los Angeles Dodgers, and, tied for third, the Minnesota Twins and the Texas Rangers.

The Rangers are tied for 3rd in home runs, so they do have power hitters who can score a run by themselves. This is needed in a strong offense, because sometimes batters will get a hit and the next batter will get out, such as by a double play or a strikeout.
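A tie like this one is why a "top 3" list can contain four teams. The sketch below shows one common way to rank with ties (standard competition ranking, where tied teams share a rank and the next rank is skipped). The team labels and totals are made up and are not the 2023 figures.

```python
def competition_rank(totals: dict[str, int]) -> dict[str, int]:
    """Rank by descending total; equal totals share a rank (1, 2, 3, 3, 5, ...)."""
    ordered = sorted(totals.values(), reverse=True)
    # list.index returns the first position of a value, so tied values
    # all map to the same (best) rank.
    return {name: ordered.index(value) + 1 for name, value in totals.items()}


# Hypothetical home run totals (not real data).
home_runs = {"Team A": 300, "Team B": 250, "Team C": 230, "Team D": 230, "Team E": 220}
print(competition_rank(home_runs))
# {'Team A': 1, 'Team B': 2, 'Team C': 3, 'Team D': 3, 'Team E': 5}
```

For statistics where lower is better, such as ERA or WHIP, the same idea applies with the sort order reversed.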

### Part 2: How Well Certain Statistics Correlate with Winning

**Pitching:**

It is interesting to see that ERA does correlate with winning. The correlation is negative: the higher a team’s ERA, the fewer wins it has. The Rangers are fairly far from the regression line, however, so they are an example of how loosely wins can track ERA. They had 90 wins with a 4.28 ERA. The Atlanta Braves had the most wins of any MLB team in 2023 (104), but their ERA was 4.14, which ranked 15th in the MLB and was not much better than the Rangers’. The Milwaukee Brewers had the best ERA in the MLB at 3.71 but won only 92 games, just two more than the Rangers.
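Being "far from the regression line" can be made concrete as a residual: actual wins minus the wins the fitted line predicts for that ERA. The sketch below fits an ordinary least-squares line by hand. It is an assumption that the charts used a simple linear fit like this one, and the four data points are invented, not 2023 values.

```python
def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residuals(xs: list[float], ys: list[float]) -> list[float]:
    """Actual minus predicted y; positive means more wins than the line expects."""
    slope, intercept = fit_line(xs, ys)
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


# Hypothetical team ERAs and win totals (not real data).
eras = [3.5, 4.0, 4.5, 5.0]
wins = [95, 92, 78, 75]

slope, intercept = fit_line(eras, wins)
print(f"slope={slope:.1f}, intercept={intercept:.1f}")  # slope=-14.8, intercept=147.9
print([round(r, 1) for r in residuals(eras, wins)])     # [-1.1, 3.3, -3.3, 1.1]
```

In this made-up data the second team has 3.3 more wins than the line predicts, which is the sense in which a team has "more wins than expected".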

It is also interesting to see that strikeouts do not correlate with winning as strongly as ERA does. This makes sense intuitively, because there are many other ways to get outs besides strikeouts. Many great pitchers get outs by inducing weak contact, which leads to groundouts or short flyouts. Weak contact could arguably be better than strikeouts, because the pitcher throws fewer pitches and can thus go deeper into games and get more outs. The Twins had the most strikeouts but fewer wins than expected, while the Rangers and Braves both had more wins than expected for their number of strikeouts.

Out of the main pitching categories, WHIP actually has the strongest correlation with winning. It has the highest correlation coefficient, and both the Braves and the Rangers have more wins than expected based on their team WHIP. The Rays have the lowest WHIP, and, as expected, they have a lot of wins.
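For readers unfamiliar with the correlation coefficient, here is a minimal sketch of the Pearson coefficient, assuming that is the measure used in this analysis. The numbers are hypothetical, not the 2023 team values. A value near -1 or +1 means the points fall close to a straight line, and the sign gives the direction.

```python
import math


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


# Hypothetical team WHIPs and win totals (not real data).
whips = [1.15, 1.25, 1.35, 1.45]
wins = [98, 90, 82, 70]

r = pearson_r(whips, wins)
print(f"r = {r:.2f}")  # negative: a higher WHIP goes with fewer wins in this sample
```

When comparing statistics such as ERA, strikeouts, and WHIP, the absolute value of r is what matters, since a strong negative relationship is as informative as a strong positive one.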

**Batting:**

Batting average has a strong correlation with winning. This shows in the data: the Braves have both the most wins in the MLB and the highest batting average, and the Rangers have a lot of wins and are near the top of the league in batting average as well.

On-base percentage has an even stronger correlation with winning, with a correlation coefficient of almost 0.80. The Braves have the highest on-base percentage, and they have more wins than expected, while the Rangers have fewer wins than expected given their on-base percentage.

Slugging percentage has a very strong correlation with winning, with a correlation coefficient of almost 0.70. The Braves actually have fewer wins than expected given their slugging percentage, and so do the Rangers. The Braves are first in the league in slugging, so it makes sense that they have so many wins.

Runs scored is a very important category for batting and offense, so it has a very strong correlation with winning. The correlation coefficient is about 0.8, and the Braves have about as many wins as expected. The Rangers have more wins than expected given the number of runs they scored.

Surprisingly, home runs do not have a strong correlation with winning. One reason is that there are many other ways to score runs: teams can string together hits, move runners over, and score on a sacrifice fly or a groundout. The Braves and Rangers both have about the number of wins we would expect given the number of home runs they hit.

### Conclusion:

Based on my analysis and visualizations, we can conclude that a team does not need to be at the top of every statistical category to win. The 2023 World Series champion Texas Rangers were mediocre in ERA and strikeouts, which is very surprising, because many believe that preventing runs is one of the most important parts of baseball, if not the most important, and the Rangers did not do that well. By these regular-season numbers the Rangers would not have been expected to win the World Series, but they may have played better in the postseason, which led to more wins. Statistically speaking, the Atlanta Braves were the best team in baseball: they ranked #1 in many offensive categories and were above average in pitching. However, they did not win the World Series, so in baseball the best team does not always win.
