
Counting on Poisson Regression with scikit-learn

Thomas J. Fan
@thomasjpfan
This talk on GitHub: thomasjpfan/scipy-2022-poisson

Count data shows up in many domains:

  • Agriculture
  • Risk modeling
  • Predictive maintenance


Poisson Regression!

PoissonRegressor

from sklearn.linear_model import PoissonRegressor
reg = PoissonRegressor()

RandomForestRegressor

from sklearn.ensemble import RandomForestRegressor
reg = RandomForestRegressor(criterion="poisson")

HistGradientBoostingRegressor

from sklearn.ensemble import HistGradientBoostingRegressor
reg = HistGradientBoostingRegressor(loss="poisson")

Bike Sharing Dataset 🚲


ColumnTransformer

from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder, SplineTransformer

preprocessor = ColumnTransformer([
    (
        "cyclic_hour",
        SplineTransformer(n_knots=13, extrapolation="periodic"),
        ["hour"],
    ),
    (
        "categorical",
        OneHotEncoder(handle_unknown="ignore"),
        ["is_holiday", "weather_code", "is_weekend", "season"],
    ),
], remainder=MinMaxScaler())

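A quick, hypothetical check of what the preprocessor above produces; the DataFrame is a stand-in for the bike-share features, with "temp" standing in for the numeric columns handled by the remainder:

import pandas as pd

df = pd.DataFrame({
    "hour": [0, 6, 12, 18, 23],
    "is_holiday": [0, 0, 1, 0, 0],
    "weather_code": [1, 2, 1, 3, 1],
    "is_weekend": [0, 1, 0, 0, 1],
    "season": [0, 1, 2, 3, 0],
    "temp": [3.5, 10.0, 21.0, 15.5, 5.0],   # scaled by remainder=MinMaxScaler()
})
Xt = preprocessor.fit_transform(df)
# extrapolation="periodic" yields n_knots - 1 = 12 spline features for "hour",
# so hour 23 wraps smoothly around to hour 0.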

PoissonRegressor 🎢




Generalized Linear Models (GLM)

\hat{y}(w, X) = h(Xw)

where \hat{y} are the predicted values, X is the feature matrix, and h is the inverse link function.

The minimization problem becomes:

\min_{w} \frac{1}{2n} \sum_i d(y_i, \hat{y}_i) + \frac{\alpha}{2} ||w||_2^2

where d is the unit deviance and \alpha is the L2 regularization strength.

Deviance
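
The deviance d measures how far a prediction is from the observed target; for the Poisson distribution used in this talk, the unit deviance is

d(y_i, \hat{y}_i) = 2 \left( y_i \log\frac{y_i}{\hat{y}_i} - y_i + \hat{y}_i \right)

with the y_i \log(\cdot) term taken as 0 when y_i = 0.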


Minimization Problem

scipy.optimize.minimize with the L-BFGS-B solver is used to solve:

\min_{w} \frac{1}{2n} \sum_i d(y_i, \hat{y}_i) + \frac{\alpha}{2} ||w||_2^2
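
A minimal sketch of that setup, assuming a log link and the Poisson deviance above; the function and variable names are illustrative, not scikit-learn's internals:

import numpy as np
from scipy.optimize import minimize

def penalized_poisson_loss(w, X, y, alpha):
    y_hat = np.exp(X @ w)                    # log link: y_hat = exp(Xw)
    ratio = np.where(y > 0, y / y_hat, 1.0)  # avoid log(0) for zero counts
    dev = 2 * (y * np.log(ratio) - y + y_hat)
    return dev.mean() / 2 + alpha / 2 * w @ w

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = rng.poisson(np.exp(X[:, 0]))

res = minimize(penalized_poisson_loss, x0=np.zeros(3),
               args=(X, y, 1.0), method="L-BFGS-B")
w_hat = res.x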


Regularization by Default!

PoissonRegressor(alpha=1.0)


Preprocessor + Linear Model

Poisson

from sklearn.pipeline import make_pipeline
from sklearn.linear_model import PoissonRegressor

poisson = make_pipeline(preprocessor, PoissonRegressor(...))

Ridge

from sklearn.linear_model import Ridge

ridge = make_pipeline(preprocessor, Ridge())

Evaluation

from sklearn.model_selection import TimeSeriesSplit

cv = TimeSeriesSplit(
    n_splits=50,
    max_train_size=10000,
    test_size=336,  # 336 hours = two weeks of hourly data
)
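
A hedged sketch of how this splitter might be used to score the pipelines; X and y are the hypothetical preprocessed frame and count target, and "neg_mean_poisson_deviance" is scikit-learn's built-in scorer string for the Poisson deviance:

from sklearn.model_selection import cross_validate

results = cross_validate(
    poisson, X, y, cv=cv,
    scoring="neg_mean_poisson_deviance",  # higher (less negative) is better
)
results["test_score"].mean()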

Results - Linear Models

Distributions - Linear Models

Calibration for Regression

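One way to read the calibration plots, assuming they bin samples by predicted value and compare the mean prediction to the mean observed count per bin (a hedged sketch; the binning scheme is illustrative):

import numpy as np

def calibration_curve_regression(y_true, y_pred, n_bins=10):
    order = np.argsort(y_pred)
    bins = np.array_split(order, n_bins)   # equal-sized bins by prediction
    mean_pred = np.array([y_pred[b].mean() for b in bins])
    mean_true = np.array([y_true[b].mean() for b in bins])
    # A well-calibrated model has mean_pred close to mean_true in every bin
    return mean_pred, mean_true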

Calibration - Linear Models

Random Forest 🎄🎄🎄

Random Forest With Poisson 🎄🎄🎄

RandomForestRegressor(criterion="poisson")

How does Poisson Influence the Random Forest?

\text{H}(Q) = \frac{1}{n}\sum_{i\in Q} \left( y_i \log\left(\frac{y_i}{\hat{y}_i}\right) + \hat{y}_i - y_i \right)

where

  • y_i is the true value
  • \hat{y}_i is the predicted value (the mean of y over the node)
  • Q is a node
  • n is the number of data points in the node.
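
A small numpy sketch of this criterion for one node, assuming (as above) that the node's prediction is the mean of its targets:

import numpy as np

def poisson_criterion(y_node):
    y_hat = y_node.mean()                  # node prediction
    with np.errstate(divide="ignore", invalid="ignore"):
        # 0 * log(0 / y_hat) is taken as 0 for zero counts
        terms = np.where(y_node > 0, y_node * np.log(y_node / y_hat), 0.0)
    return (terms + y_hat - y_node).mean()

y = np.array([0, 1, 2, 4, 1])
poisson_criterion(y)   # splits are chosen to reduce this impurity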

Random Forest

Results - Random Forest

Distributions - Random Forest

Calibration - Random Forest

Histogram-based Gradient Boosting Trees 🏂

HistGradientBoostingRegressor With Poisson 🏂

HistGradientBoostingRegressor(loss="poisson")


HistGradientBoostingRegressor Overview 🏂

Initial Condition

\hat{f}^{(0)} = C

where C is the maximum likelihood estimate for the loss.

Iterations

\hat{f}^{(t)} = \hat{f}^{(t-1)} + \nu \hat{h}^{(t)}

where

  • \hat{f}^{(t)} is the regressor at iteration t
  • \nu is the learning rate
  • \hat{h}^{(t)} is a tree fit on the gradients and Hessians.
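
For the Poisson loss the constant C has a closed form: the maximum likelihood rate of a constant model is the sample mean, so with a log link the raw baseline is its log (a tiny check, ignoring sample weights):

import numpy as np

y = np.array([0, 1, 2, 4, 1])
C = np.log(y.mean())   # raw baseline score; exp(C) recovers the mean rate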



How does Poisson Influence the Algorithm?

Trees \hat{h}^{(t)} are grown by evaluating splits with:

\text{Gain} = \frac{1}{2}\left[\frac{G_L^2}{H_L+\lambda} + \frac{G_R^2}{H_R+\lambda} - \frac{(G_L+G_R)^2}{H_L+H_R+\lambda}\right]

where

  • \lambda is the L2 regularization
  • G_L and G_R are the sums of the gradients in the left and right children
  • H_L and H_R are the sums of the Hessians in the left and right children

More Details @

SciPy 2019: Fast Gradient Boosting Decision Trees with PyGBM and Numba
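
For the Poisson loss with a log link, the per-sample gradient and Hessian with respect to the raw score f are exp(f) - y and exp(f). A hedged sketch of the gain for one candidate split (the data, partition, and lambda are illustrative):

import numpy as np

def split_gain(gl, hl, gr, hr, lam=1.0):
    def term(g, h):
        return g ** 2 / (h + lam)
    return 0.5 * (term(gl, hl) + term(gr, hr) - term(gl + gr, hl + hr))

f = np.zeros(6)                        # raw scores on the log scale
y = np.array([0, 1, 2, 4, 1, 3])
grad, hess = np.exp(f) - y, np.exp(f)  # Poisson gradients and Hessians

left, right = slice(0, 3), slice(3, 6)
split_gain(grad[left].sum(), hess[left].sum(),
           grad[right].sum(), hess[right].sum())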



Linking f with y

\hat{y}^{(t)} = h(\hat{f}^{(t)})

where h is the inverse link function. For Poisson regression:

h(z) = \exp(z)

which mirrors the GLM prediction:

\hat{y}(w, X) = h(Xw)
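
A tiny check of the link's consequence on synthetic data; the only point is that an exp inverse link keeps predictions strictly positive:

import numpy as np
from sklearn.ensemble import HistGradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = rng.poisson(np.exp(X[:, 0]))

reg = HistGradientBoostingRegressor(loss="poisson").fit(X, y)
assert (reg.predict(X) > 0).all()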

Results - Hist Gradient Boosting

Results - Hist Gradient Boosting 🔎

Distributions - Hist Gradient Boosting

Calibration - Hist Gradient Boosting

Example of Predictions

Poisson Regression with Bike Share Data 🚲🚲🚲

PoissonRegressor()

RandomForestRegressor(criterion="poisson")

HistGradientBoostingRegressor(loss="poisson")

Two More Topics 🔎

Zero-Inflated Poisson Regression


Scikit-lego!

from sklearn.ensemble import (
    HistGradientBoostingClassifier,
    HistGradientBoostingRegressor,
)
from sklego.meta import ZeroInflatedRegressor

poisson_zero = ZeroInflatedRegressor(
    classifier=HistGradientBoostingClassifier(),
    regressor=HistGradientBoostingRegressor(loss="poisson"),
)
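
As I understand the scikit-lego meta-estimator, the classifier learns whether a count is zero and the regressor is fit only on the nonzero samples; usage then follows the usual API (X and y hypothetical):

poisson_zero.fit(X, y)
y_pred = poisson_zero.predict(X)   # zero wherever the classifier predicts zero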

Exposure


df["Frequency"] = df["ClaimNb"] / df["Exposure"]
poisson_gbrt.fit(
df_train, df_train["Frequency"],
regressor__sample_weight=df_train["Exposure"],
)
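
To turn a predicted frequency back into an expected claim count, multiply by the exposure; a one-line sketch with df_test as a hypothetical hold-out frame:

expected_counts = poisson_gbrt.predict(df_test) * df_test["Exposure"]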

See the scikit-learn example: Poisson regression and non-normal loss

Counting on Poisson Regression

PoissonRegressor()

RandomForestRegressor(criterion="poisson")

HistGradientBoostingRegressor(loss="poisson")

Thomas J. Fan
@thomasjpfan
