Saturday, February 16, 2008

Predicting Saves Phase #1 - For Fantasy Baseball Purposes

When most people look at closers, it is obvious that some of them benefit simply from being named the closer. I mean, shoot, exhibits A and B should make my point clear: Armando Benitez and Jorge Julio. Sure, they have performed in that role in the past, but at some point they were no longer fit for it. With that in mind, I have decided to evaluate relief pitchers (RP) in terms of both performance and usage, using stats from THT and FanGraphs. The variables I will be using either come directly from these sites or are slightly modified:

Runs Created per Innings Pitched

K's per game

BABIP (DER)

FIP

K/BB

pLI (Leverage Index)

Save Opportunities

Holds

Games Played

First of all, the main contention here is that saves are... somewhat predictable. Obviously, one of the things that needs to be reviewed is past performance. Save opportunities on their own are fickle, though, so I prefer to think in terms of closer opportunities (CLO). I use a relatively simple equation to consider how a reliever is used:

CLO = ((Save Opportunities / Games Played) + (Holds / Games Played)) * SQRT (0.5 + pLI)
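For anyone who would rather script this than use Excel, here is a minimal Python sketch of the CLO calculation. It assumes the square-root term multiplies the whole usage rate (save opportunities plus holds, per game); the function and argument names are made up for illustration, and the sample numbers are hypothetical rather than any real pitcher's line.

```python
import math


def closer_opportunities(save_opps, holds, games, pli):
    """CLO sketch: usage rate scaled by a leverage term.

    Assumes the usage rate is (save opps + holds) / games and that the
    leverage term is sqrt(0.5 + pLI), as in the equation above.
    """
    if games <= 0:
        raise ValueError("games must be positive")
    usage_rate = (save_opps + holds) / games
    return usage_rate * math.sqrt(0.5 + pli)


# Hypothetical closer: 40 save opportunities in 80 games at a pLI of 1.75.
example_closer = closer_opportunities(40, 0, 80, 1.75)

# Hypothetical set-up man: 10 save opportunities and 30 holds in 80 games
# at a pLI of 0.5.
example_setup = closer_opportunities(10, 30, 80, 0.5)
```

Both hypothetical pitchers have the same usage rate (0.5), so the gap between their CLO values comes entirely from the leverage term.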

The intent of this formula is twofold: 1) I wanted a usage term with a maximum value of 1, which is what the first part of the equation does by dividing save opportunities and holds by games played, and 2) I wanted to directly factor in the kind of situations in which the relief pitcher is used. That is how I came up with what I term CLO. I added 0.5 to the pLI inside the square root because pLI values below 1 would otherwise skew those data points. Before I look at my other variables, here are the top 25 AL relief pitchers in terms of CLO:

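| Rnk | First | Last | Tm | SV O | Hld | CLO |
|---|---|---|---|---|---|---|
| 1 | Joe | Borowski | CLE | 53 | 0 | 1.292 |
| 2 | Francisco | Rodriguez | LAA | 46 | 0 | 1.159 |
| 3 | Todd | Jones | DET | 44 | 0 | 1.152 |
| 4 | Bobby | Jenks | CHA | 46 | 0 | 1.080 |
| 5 | Jonathan R | Papelbon | BOS | 40 | 2 | 1.058 |
| 6 | Joe | Nathan | MIN | 41 | 0 | 0.953 |
| 7 | Jeremy | Accardo | TOR | 35 | 2 | 0.918 |
| 8 | J.J. | Putz | SEA | 42 | 0 | 0.910 |
| 9 | Scot | Shields | LAA | 8 | 31 | 0.790 |
| 10 | Rafael | Betancourt | CLE | 6 | 31 | 0.785 |
| 11 | Huston L | Street | OAK | 21 | 5 | 0.781 |
| 12 | Akinori | Otsuka | TEX | 7 | 11 | 0.776 |
| 13 | Eric | Gagne | TEX | 17 | 1 | 0.774 |
| 14 | Alan | Embree | OAK | 21 | 16 | 0.771 |
| 15 | Mariano | Rivera | NYA | 34 | 0 | 0.763 |
| 16 | Joakim A | Soria | KC | 21 | 9 | 0.731 |
| 17 | Hideki | Okajima | BOS | 7 | 27 | 0.729 |
| 18 | Al | Reyes | TB | 30 | 0 | 0.724 |
| 19 | Casey C | Janssen | TOR | 11 | 24 | 0.721 |
| 20 | Chris | Ray | BAL | 20 | 0 | 0.721 |
| 21 | Joel M | Zumaya | DET | 5 | 8 | 0.693 |
| 22 | Joaquin | Benoit | TEX | 13 | 19 | 0.650 |
| 23 | Justin | Speier | LAA | 1 | 24 | 0.622 |
| 24 | C.J. | Wilson | TEX | 14 | 15 | 0.588 |
| 25 | Jamie | Walker | BAL | 13 | 21 | 0.571 |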

One of the first things that jumps out at me is that, yes, the relief pitchers with the most save opportunities did occupy the top 8 spots as expected; beyond that point, however, the order changes a bit. Scot Shields and Rafael Betancourt had more opportunities to ‘close’ a game than luminaries like Mariano Rivera, Eric Gagne and Huston Street. Since CLO is rate based, the volume lost to an injury does not matter. What it does show is that kid gloves were used with Eric Gagne when he was in Texas and with Huston Street. Now, these values may not seem very enlightening in and of themselves. I plan on changing that thought, because what I am angling towards is that a relief pitcher may seem more effective than he is because of his volume of saves or because his ERA is well below league average, and I am aiming to dispel that mirage.

I won’t lie; a large part of the reason for putting forth this effort is that I participate in fantasy sports. And so now I shall take several other stats from THT: Runs Created per Innings Pitched, K's per game, BABIP, FIP and K/BB. I look at these because they indicate performance and whether a pitcher's efforts (i.e. numbers) are sustainable. Honestly, I use all of these in an equation to calculate an RP value for my auction fantasy league, which is NL-only. So, without further rambling, here is my equation:


RP Auction Fantasy Value = SQRT (2.5 * CLO * RC/IP * K/G * (1 + (0.700 - DER)) * K/BB * (9 - FIP))
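As a rough sketch, the value equation translates to Python fairly directly. The function and argument names below are invented for illustration, the inputs in the example are hypothetical, and returning 0.0 when the product goes negative (for instance when FIP is above 9) is an assumption of this sketch, not something the equation itself specifies.

```python
import math


def rp_auction_value(clo, rc_per_ip, k_per_game, der, k_per_bb, fip, scale=2.5):
    """Square root of the scaled product of the usage and performance terms."""
    product = (
        scale
        * clo
        * rc_per_ip
        * k_per_game
        * (1 + (0.700 - der))
        * k_per_bb
        * (9 - fip)
    )
    # Assumption: treat a negative product as "no value" instead of letting
    # math.sqrt raise a ValueError.
    if product < 0:
        return 0.0
    return math.sqrt(product)


# Hypothetical inputs chosen so the arithmetic is easy to follow:
# 2.5 * 1.0 * 0.4 * 9.0 * 1.0 * 4.0 * 4.0 = 144, and sqrt(144) = 12.
example_value = rp_auction_value(
    clo=1.0, rc_per_ip=0.4, k_per_game=9.0, der=0.700, k_per_bb=4.0, fip=5.0
)
```

Because everything sits under one square root, doubling any single term raises the value by a factor of about 1.41 rather than 2.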

I use 2.5 as my normalizing constant simply by choice, with no basis other than that I liked the way the values shaped up. I typically like to look at 3-year trends when doing fantasy values, but I feel that can get thrown right out when looking at closers because RP are just too fickle. So I tend to use just the previous year to evaluate them for the next. Plus, usage depends on the manager, and on whether the manager changes or the player gets traded. Before I present my top 25 values for AL RP, I would like to add a couple more comments based on the number of unique pitchers per team who recorded a save or a hold, averaged over 2000 – 2007.

1) 4.57 pitchers per AL team accrued a save,

2) 8.36 pitchers per AL team accrued a hold,

3) The average NL team had 1.04 more pitchers accrue a hold than their AL counterparts and,

4) The average AL team had 0.15 more pitchers accrue a save than their NL counterparts.

This tells me a couple of things:

1) One player does not accrue all of the saves for a team; on average, more than 4 players do. This means that one should, obviously, never count on a RP being ‘their’ closer for the whole year.

2) Nearly twice as many pitchers per team get a hold as get a save (8.36 vs. 4.57). This means that there are a lot of pitchers who get the opportunity to demonstrate their effectiveness, and those that are successful COULD run into the opportunity to pick up a save here and there. This further strengthens the notion that a standard roto fantasy baseball owner should strongly consider obtaining a high-quality relief pitcher before ruining their other stats with a SP who will likely gather more wins (exhibits A and B: Livan Hernandez and Matt Morris), because these RP will probably garner a comparable strikeout total.

3) NL teams use more RP than AL teams do. This indicates that a RP on an NL team is not as likely to be given the opportunity to close out a game if he is not the designated closer. This makes sense, as NL pitchers are frequently pulled for pinch-hitters, but that may not be the only root cause.

4) It is difficult to know whether there is any real rationale behind AL teams averaging more unique pitchers with a save than NL teams, but at the very least it helps solidify the concept that every fantasy owner (and real-life GM) should expect a team to have 4+ pitchers who ‘save’ a game, and therefore fantasy owners should take a chance on that Kevin Gregg this year.
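The NL figures implied by the averages above can be worked out with a few lines of arithmetic. This is only a sketch that combines the four numbers quoted in the post; the variable names are made up.

```python
# Averages quoted above (unique pitchers per team, 2000-2007).
al_save_pitchers = 4.57
al_hold_pitchers = 8.36

# NL teams averaged 1.04 more pitchers with a hold and 0.15 fewer with a save.
nl_hold_pitchers = round(al_hold_pitchers + 1.04, 2)
nl_save_pitchers = round(al_save_pitchers - 0.15, 2)

# Hold pitchers per save pitcher in each league.
al_hold_to_save = round(al_hold_pitchers / al_save_pitchers, 2)
nl_hold_to_save = round(nl_hold_pitchers / nl_save_pitchers, 2)

print(f"NL: {nl_hold_pitchers} with a hold, {nl_save_pitchers} with a save")
print(f"holds per save pitcher: AL {al_hold_to_save}, NL {nl_hold_to_save}")
```

So the "twice as many" comparison works out to roughly 1.8 to 1 in the AL and a bit over 2 to 1 in the NL.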

So, with these new equations in tow, here are the top 25 fantasy values, in dollars, for AL RP:

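| Rnk | First | Last | Tm | F.V. |
|---|---|---|---|---|
| 1 | Jonathan R | Papelbon | BOS | 35.9 |
| 2 | Rafael | Betancourt | CLE | 35.7 |
| 3 | J.J. | Putz | SEA | 33.4 |
| 4 | Joba L | Chamberlain | NYA | 31.5 |
| 5 | Joe | Nathan | MIN | 24.1 |
| 6 | Mariano | Rivera | NYA | 21.3 |
| 7 | Huston L | Street | OAK | 21.0 |
| 8 | Bobby | Jenks | CHA | 20.8 |
| 9 | Francisco | Rodriguez | LAA | 20.1 |
| 10 | Joakim A | Soria | KC | 18.8 |
| 11 | Hideki | Okajima | BOS | 16.4 |
| 12 | George F | Sherrill | SEA | 15.0 |
| 13 | Joaquin | Benoit | TEX | 13.9 |
| 14 | Rafael E | Perez | CLE | 13.6 |
| 15 | Justin | Speier | LAA | 12.9 |
| 16 | Eric | Gagne | TEX | 13.1 |
| 17 | Jeremy | Accardo | TOR | 12.8 |
| 18 | Joe | Borowski | CLE | 12.7 |
| 19 | Akinori | Otsuka | TEX | 11.5 |
| 20 | Scott | Downs | TOR | 10.8 |
| 21 | Scot | Shields | LAA | 10.0 |
| 22 | Al | Reyes | TB | 9.9 |
| 23 | Alan | Embree | OAK | 9.6 |
| 24 | Chris | Ray | BAL | 9.4 |
| 25 | Manny | Delcarmen | BOS | 8.6 |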

I chose to look at the top 25 because I thought it was interesting that, while Scot Shields ranked very high in obtaining closer opportunities, the ranking is not justified by his performance: he tumbles 12 spots to #21. Although Justin Speier had his opportunity in Toronto before, he appears to be a better candidate now than Shields. Although it is obvious, it is also quite interesting that there seem to be two RP who would be better candidates to close out a game than Cleveland’s Joe Borowski: Rafael Betancourt and Rafael Perez. I did not institute a minimum amount of playing time, and that is why Joba Chamberlain appears so high on the list. It is not that he did not deserve it in his limited playing time, but it is just that: limited. It may also mean that Joba stays in the bullpen permanently as the heir apparent to Rivera. Joaquin Benoit seems to be the better choice for ‘closer’ when compared to C.J. Wilson. This list helps reinforce the notion that just because a RP has the closer title, it does not mean that he has earned it through his accomplishments. Like Borowski, Todd Jones falls sharply, from #3 on the CLO list to #38 in RP value, as their ERAs of 5.07 and 4.26, respectively, show.
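The rank movement described above is easy to tabulate. The sketch below uses ranks read off the two lists (plus the #38 quoted for Todd Jones); the dictionary and variable names are just for illustration.

```python
# Rank on the CLO list vs. rank on the fantasy value list.
clo_rank = {
    "Borowski": 1, "Jones": 3, "Shields": 9,
    "Betancourt": 10, "Okajima": 17, "Benoit": 22, "Speier": 23,
}
value_rank = {
    "Borowski": 18, "Jones": 38, "Shields": 21,
    "Betancourt": 2, "Okajima": 11, "Benoit": 13, "Speier": 15,
}

# Positive numbers mean the pitcher fell once performance was factored in.
spots_dropped = {name: value_rank[name] - clo_rank[name] for name in clo_rank}

# Biggest fallers first.
for name, drop in sorted(spots_dropped.items(), key=lambda item: -item[1]):
    print(f"{name:<11} CLO #{clo_rank[name]:<3} value #{value_rank[name]:<3} {drop:+d}")
```

The named closers (Jones, Borowski) and Shields land at the top of the output as the biggest fallers, while the set-up men with strong peripherals (Betancourt, Benoit, Speier, Okajima) move up.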

So, in summary, I would strongly consider picking up the following ‘set-up men’ for the AL:

1) Rafael Betancourt

2) Joaquin Benoit

3) Al Reyes

4) Hideki Okajima

5) Scot Shields

I would comment that George Sherrill appears to have the inside lane to be the closer in Baltimore this year. Going by last year’s data, however, the appropriate hand-cuff is likely to be Chad Bradford, even though Jamie Walker had more opportunity, because Walker, like Sherrill, is a LHP.

One last side comment: do not expect Troy Percival to be anywhere near as successful as he was last year. This may seem obvious, but he was rarely used in high-leverage situations, let alone with a save opportunity or hold on the line. That also does not factor in that he is going from the much weaker NL Central to the AL East. The 3-year park effect seems to be fairly neutral. My recommendation of Al Reyes is not so much that he is a strong RP; it is more that he has been entrusted with this role before and Percival is not likely to repeat his 2007 performance.

Anyway, I hope anyone who might read this enjoyed my 2nd post. Next, um, I am actually going to support these numbers by going back in time a few years and looking at all of the changes in closers, in combination with maybe… just maybe doing some regression analysis. See ya.

Tuesday, February 12, 2008

A Different Slant to Pitch F/x

Well, a dude has to start somewhere. First, I have to say thank you to Josh Kalk, because I am using the data cards from his website (er, it's also one of the links to the right) directly for the averages and percentages of each pitcher. I do not know how to download the data because my software skills are non-existent other than Excel. I am also using the league averages for each pitch from his work at The Hardball Times.

Second, I have seen a lot of analysis of Pitch F/x that is well beyond my capabilities, but I have not found what I consider basic, yet very interesting. What I have been looking for is why a pitcher's pitch is effective and whether that can be sourced via Pitch F/x. Primarily, this post looks at a select set of pitchers chosen in two ways: 1) Baseball Prospectus' (and Nate Silver's) 2008 PECOTA WXRL values and 2) a few select outliers that I included because of my interest in them for the purpose of this post. Specifically, I am going to look at each pitcher's second most frequently thrown pitch (by percentage) and compare it to the league average for that pitch, to the pitcher's own Fastball, and to the difference between the league average Fastball and the league average for that particular primary non-Fastball pitch. I will be looking at the speed of the pitch and its movement in the x and z coordinates.

My biggest limitation in all of this (besides my poor mathematical skills) is that I am primarily using the Fastball as my reference for effective pitching, yet I am using data that does not differentiate between a two-seamer and a four-seamer. Also, I have eliminated Cutters and Sinkers as the primary reference... sorry about all of that, but like I said, a dude has to start somewhere, and the averages indicate Fastball first. I am in the validation business and am not involved in anything statistics-heavy, so to speak, but I like to have a sample size of at least 30 when I look at things, which is the preference for batching and so on and so forth.

In fact, I am looking at 33 SP, because 3 of them do not throw a "fastball" but I did want to include them in the second part of what I was looking at, which was the primary non-Fastball pitch they throw (versus the league average). (FYI, the 3 pitchers in question are Greg Maddux, A.J. Burnett, and Roy Halladay.) I have also included 9 LHP, whose x movement I have similarly 'normalized' (in italics). Anyway, here is a list of the players that I included and what their first secondary pitch is according to percent thrown:

| List of Players | First 2˚ Pitches | Age (by 10/1/08) |
|---|---|---|
| Beckett, Josh | Curve | 28 |
| Bedard, Erik | Curve | 29 |
| Blanton, Joe | Sinker (Slider) | 27 |
| Bonderman, Jeremy | Slider | 25 |
| Burnett, A.J. | Sinker (Curve) | 31 |
| Carmona, Fausto | Sinker (Change) | 24 |
| Francis, Jeff | Change | 27 |
| Halladay, Roy | Sinker (Cutter) | 31 |
| Hamels, Cole | Change | 24 |
| Harang, Aaron | Slider | 30 |
| Haren, Dan | Curve + Change | 28 |
| Hernandez, Felix | Slider | 22 |
| Hill, Rich | Curve | 28 |
| Kazmir, Scott | Slider | 24 |
| Lackey, John | Curve | 29 |
| Maddux, Greg | Sinker (Change) | 42 |
| Matsuzaka, Daisuke | Slider | 28 |
| Oswalt, Roy | Slider + Curve | 31 |
| Peavy, Jake | Slider | 27 |
| Penny, Brad | Change | 30 |
| Pettitte, Andy | Cutter | 36 |
| Sabathia, C.C. | Slider | 28 |
| Santana, Johan | Change | 29 |
| Sheets, Ben | Curve | 30 |
| Shields, James | Change | 26 |
| Smoltz, John | Slider | 41 |
| Vazquez, Javier | Slider | 32 |
| Verlander, Justin | Curve + Change | 25 |
| Wang, Chien-Ming | Slider | 28 |
| Webb, Brandon | Sinker (Slider) | 29 |
| Young, Chris R | Slider | 29 |
| Zambrano, Carlos | Cutter | 27 |
| Zito, Barry | Change | 30 |
| 24 RHP, 9 LHP | 14 SL, 10 Ch, 9 Cur, 3 Cut | 28.9 (avg.) |

As mentioned, I am not including sinkers in this snapshot, but they really should be considered going forward; I just have to gain the ability to do so. In the meantime, here is a quick breakdown of these pitchers' first secondary pitches. Some pitchers have two virtually identical first secondary pitches and are counted under both, which is why the counts add up to 36 rather than 33:

| First 2˚ Pitch | Share |
|---|---|
| 14 Sliders | 38.9% |
| 10 Changes | 27.8% |
| 9 Curves | 25.0% |
| 3 Cutters | 8.3% |
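The percentages in this breakdown can be reproduced with a short Python sketch. The key thing to notice is that the denominator is the 36 pitch entries, not the 33 pitchers, since the pitchers listed with two first secondary pitches are counted under both. Variable names are illustrative.

```python
# First secondary pitch counts from the player list.
pitch_counts = {"Slider": 14, "Change": 10, "Curve": 9, "Cutter": 3}

total_entries = sum(pitch_counts.values())  # 36 entries across 33 pitchers

# Share of each pitch type, as a percentage rounded to one decimal place.
pitch_share = {
    pitch: round(100 * count / total_entries, 1)
    for pitch, count in pitch_counts.items()
}

for pitch, share in pitch_share.items():
    print(f"{pitch_counts[pitch]:>2} {pitch:<7} {share:.1f}%")
```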

Sliders, then, are the most prevalent first secondary pitch. I really did not think about the age of the pitchers when I selected them, other than Greg Maddux, who was more of an interest because of his control of pitch location; however, across the three major secondary pitches there is essentially no difference in average age (Slider: 28.6, Change: 28.5, Curve: 28.8). I should say that, of the 9 LHP, the first secondary pitch was a Change for 4, a Curve for 2, a Slider for 2, and a Cutter for the other. This leaves the RHP breakdown at 12 Sliders, 7 Curves, 6 Changes, and 2 Cutters.

So for RHP, the primary pitch to complement the Fastball is the Slider; for LHP, it is the Change.

| 1st 2˚ Pitches | Type | Movement in x (in.) |
|---|---|---|
| Averages | Fastball | -5.61 |
| | vs. Lg | -0.15 |
| | Slider (14) | 2.56 |
| | vs. Lg | 0.26 |
| | vs. Fb | 8.08 |
| | vs. LgFb | 0.66 |
| | Change (10) | -5.79 |
| | vs. Lg | 0.71 |
| (Minus Maddux) | vs. Fb | -0.49 |
| | vs. LgFb | 0.55 |
| | Curve (9) | 5.30 |
| | vs. Lg | 0.60 |
| (Minus Burnett) | vs. Fb | 10.79 |
| | vs. LgFb | 0.63 |

| 1st 2˚ Pitches | Type | Movement in z (in.) |
|---|---|---|
| Averages | Fastball | 10.78 |
| | vs. Lg | 1.00 |
| | Slider (14) | 2.98 |
| | vs. Lg | 0.48 |
| | vs. Fb | -7.45 |
| | vs. LgFb | -0.60 |
| | Change (10) | 6.06 |
| | vs. Lg | 0.26 |
| (Minus Maddux) | vs. Fb | -5.57 |
| | vs. LgFb | -1.59 |
| | Curve (9) | -3.89 |
| | vs. Lg | 0.51 |
| (Minus Burnett) | vs. Fb | -14.12 |
| | vs. LgFb | 0.06 |

| 1st 2˚ Pitches | Type | Initial Speed (MPH) |
|---|---|---|
| Averages | Fastball | 92.76 |
| | vs. Lg | 0.96 |
| | Slider (14) | 83.84 |
| | vs. Lg | 0.54 |
| | vs. Fb | -9.41 |
| | vs. LgFb | -1.24 |
| | Change (10) | 82.45 |
| | vs. Lg | -0.05 |
| (Minus Maddux) | vs. Fb | -9.07 |
| | vs. LgFb | 0.23 |
| | Curve (9) | 78.78 |
| | vs. Lg | 1.48 |
| (Minus Burnett) | vs. Fb | -15.09 |
| | vs. LgFb | -0.59 |
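The three comparison rows in the tables above can be sketched in Python as follows. This rests on one reading of the labels, which is an assumption: "vs. Lg" is the pitch minus the league average for that pitch, "vs. Fb" is the pitch minus the pitcher's own fastball, and "vs. LgFb" is that fastball gap minus the league's gap between the same two pitches. That reading seems consistent with the Change and Curve rows, but treat it as a guess. The class, function, and field names are invented, and the numbers in the example are hypothetical, not taken from the data cards.

```python
from dataclasses import dataclass


@dataclass
class PitchAverages:
    speed_mph: float
    x_in: float  # horizontal movement, inches
    z_in: float  # vertical movement, inches


def compare_pitch(pitch, fastball, league_pitch, league_fastball):
    """Return the vs. Lg / vs. Fb / vs. LgFb differences for each measure."""
    rows = {}
    for measure in ("speed_mph", "x_in", "z_in"):
        own = getattr(pitch, measure)
        league = getattr(league_pitch, measure)
        vs_fb = own - getattr(fastball, measure)
        league_gap = league - getattr(league_fastball, measure)
        rows[measure] = {
            "vs_lg": own - league,
            "vs_fb": vs_fb,
            "vs_lgfb": vs_fb - league_gap,
        }
    return rows


# Hypothetical pitcher and league averages (illustrative values only).
example_rows = compare_pitch(
    pitch=PitchAverages(78.0, 5.0, -4.0),           # pitcher's curve
    fastball=PitchAverages(93.0, -5.5, 11.0),       # pitcher's fastball
    league_pitch=PitchAverages(77.5, 4.5, -4.5),    # league curve
    league_fastball=PitchAverages(92.0, -5.5, 10.0) # league fastball
)
```

In this made-up example the curve is 15.0 mph slower than the pitcher's own fastball, while the league gap is 14.5 mph, so the "vs. LgFb" speed figure comes out to -0.5.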

I guess my comments would be the following:

  • The ~1 mph faster fastball allows all of the other pitches to be more effective, so long as the secondary pitches are at least league average, because the speed difference is greater by default with a faster fastball.
  • A league average slider by default becomes more effective for this set, likely because of the greater speed of this set's Fastball. In addition, it is possible that the difference in movement down and in to the hitter is significant, further adding to this set's effectiveness.
  • As for this set's Changes, the downward difference in movement relative to their fastball is significantly greater than the league average difference. This would seem to be the primary source of effectiveness, but more investigation is clearly needed.
  • Finally, when looking at this set's Curves, well, it is not very clear. Maybe the difference in speed and movement towards the hitter is relevant, but it may not be. As is obvious, more investigation is needed.

I hope at least some of this makes sense. The data clearly is biased, so I do not know if this can even be taken with a grain of salt; nonetheless, it would be interesting if it turns out to be pertinent. And again, my exclusion of sinkers could really be throwing all of this data off, in addition to not having definitive 2-seam and 4-seam differentiations...

I know one of the things that I should do is look at correlations between these differences and a pitcher's effectiveness, but I haven't gotten that far and I haven't really thought about which stat I want to use as the basis. And I know some other people have looked at pitching sequences, but I think that is skipping a step... anyway... thanks for humoring me in my first real post... I think...

Intro / Disclaimer

So, one of my primary hobbies is baseball. And in that, I have decided to start a blog about my interest, well, because I can and because I can control what goes on versus a fan blog, so on and so forth.

DISCLAIMER: I just wanted to make a disclaimer (that's right, after the big word disclaimer, I am saying it again), and that is that I am a fan, enthusiast, and fantasy player of baseball and I just have a regular ol' Master's in Chemistry, so I don't do heavy duty mathematical equations and such; I am just going to write about my perspective, okay? Okay... I am glad I got the disclaimer out of the way... sheesh.