title: Counting on Poisson Regression with Scikit-learn use_katex: True class: title-slide # Counting on Poisson Regression with data:image/s3,"s3://crabby-images/61236/61236c366ef808f76b6becbd8a8bd391e1ef4077" alt=":scale 25%" .larger[Thomas J. Fan]
@thomasjpfan
This talk on Github: thomasjpfan/scipy-2022-poisson
--- class: center # Agriculture data:image/s3,"s3://crabby-images/f62b0/f62b03ad0488bc15bb89068c70f075ac60b71fb1" alt=":scale 80%" --- class: center # Risk modeling data:image/s3,"s3://crabby-images/0cba7/0cba7fad1bb13246437a79263b32a5d54a700bc7" alt=":scale 80%" --- class: center # Predictive maintenance data:image/s3,"s3://crabby-images/d3096/d3096bf03ca4ae12f488c02592183289c4612618" alt=":scale 80%" --- class: top # Poisson Regression! ## PoissonRegressor ```python from sklearn.linear_model import PoissonRegressor reg = PoissonRegressor() ``` -- ## RandomForestRegressor ```python from sklearn.ensemble import RandomForestRegressor reg = RandomForestRegressor(criterion="poisson") ``` -- ## HistGradientBoostingRegressor ```python from sklearn.ensemble import HistGradientBoostingRegressor reg = HistGradientBoostingRegressor(loss="poisson") ``` --- class: center # Bike Sharing Dataset 🚲 data:image/s3,"s3://crabby-images/dc576/dc576c34f18f20dd8a3f2b04bff1c09df118ea6f" alt=":scale 80%" --- class: center # Bike Sharing Dataset 🚲 data:image/s3,"s3://crabby-images/2817c/2817cadb15fbeefa852e330a86f31fcab7617358" alt="" --- class: center # Bike Sharing Dataset 🚲 data:image/s3,"s3://crabby-images/2c488/2c4887e54c1fb2154f0a714242a4c3cf46a52254" alt="" --- class: center # Bike Sharing Dataset 🚲 data:image/s3,"s3://crabby-images/b9a6c/b9a6c974735e67e9503d523fb53a9be7d4ed44da" alt="" --- # ColumnTransformer ```python preprocessor = ColumnTransformer([ ( "cyclic_hour", SplineTransformer(n_knots=13, extrapolation="periodic"), ["hour"] ), ( "categorical", OneHotEncoder(handle_unknown="ignore"), ["is_holiday", "weather_code", "is_weekend", "season"] ), ], remainder=MinMaxScaler()) ``` .center[ data:image/s3,"s3://crabby-images/e94e8/e94e818883fc4ba696b101a377529e257df1b5ce" alt=":scale 50%" ] --- # ColumnTransformer ```python preprocessor = ColumnTransformer([ ( "cyclic_hour", SplineTransformer(n_knots=13, extrapolation="periodic"), ["hour"] ), ( * "categorical", * OneHotEncoder(handle_unknown="ignore"), * ["is_holiday", "weather_code", "is_weekend", "season"] ), ], remainder=MinMaxScaler()) ``` .center[ data:image/s3,"s3://crabby-images/e94e8/e94e818883fc4ba696b101a377529e257df1b5ce" alt=":scale 50%" ] --- # ColumnTransformer ```python preprocessor = ColumnTransformer([ ( "cyclic_hour", SplineTransformer(n_knots=13, extrapolation="periodic"), ["hour"] ), ( "categorical", OneHotEncoder(handle_unknown="ignore"), ["is_holiday", "weather_code", "is_weekend", "season"] ), *], remainder=MinMaxScaler()) ``` .center[ data:image/s3,"s3://crabby-images/e94e8/e94e818883fc4ba696b101a377529e257df1b5ce" alt=":scale 50%" ] --- # ColumnTransformer ```python preprocessor = ColumnTransformer([ ( * "cyclic_hour", * SplineTransformer(n_knots=13, extrapolation="periodic"), * ["hour"] ), ( "categorical", OneHotEncoder(handle_unknown="ignore"), ["is_holiday", "weather_code", "is_weekend", "season"] ), ], remainder=MinMaxScaler()) ``` .center[ data:image/s3,"s3://crabby-images/e94e8/e94e818883fc4ba696b101a377529e257df1b5ce" alt=":scale 50%" ] --- # Periodic spline features .center[ data:image/s3,"s3://crabby-images/35111/35111082d08ee797632d45ab7ab0065acf23cf76" alt="" ] - [Time-related feature engineering example](https://scikit-learn.org/stable/auto_examples/applications/plot_cyclical_feature_engineering.html#sphx-glr-auto-examples-applications-plot-cyclical-feature-engineering-py) --- class: chapter-slide # PoissonRegressor 🎢 --- class: top
# Generalized Linear Models (GLM) $$ \hat{y}(w, X) = h(Xw) $$ where $\hat{y}$ is the predicted values, $X$ are features, and $h$ is the inverse link function. --
## Minimization problem becomes: $$ \min_{w}\frac{1}{2n} \sum_i d(y_i, \hat{y}_i) + \frac{\alpha}{2}||w||_2^2 $$ where $\alpha$ is the L2 regularization penalty. .footnote-back[ [User Guide](https://scikit-learn.org/stable/modules/linear_model.html#generalized-linear-regression) ] --- # Deviance data:image/s3,"s3://crabby-images/80e14/80e1457b496c682c744201bbfb605bceb6a46b5d" alt="" .footnote-back[ [User Guide](https://scikit-learn.org/stable/modules/linear_model.html#generalized-linear-regression) ] --- class: top # Minimization Problem `scipy.optimize.minimize` is used with `L-BFGS-B` $$ \min_{w}\frac{1}{2n} \sum_i d(y_i, \hat{y}_i) + \frac{\alpha}{2}||w||_2^2 $$ -- - Cholesky based Newton solver: [PR #23314](https://github.com/scikit-learn/scikit-learn/pull/23314) - Newton-LSMR: [PR #23507](https://github.com/scikit-learn/scikit-learn/pull/23507) --
## Regularization by Default! .center.larger[ `PoissonRegressor(alpha=1.0)` ] --- class: top # Preprocessor + Linear Model ## Poisson ```python poisson = make_pipeline(preprocessor, PoissonRegressor(...)) ``` .center[ data:image/s3,"s3://crabby-images/939f3/939f3d9c1e612862ee6d153553dac6a823bdfa82" alt=":scale 50%" ] -- ## Ridge ```python ridge = make_pipeline(preprocessor, Ridge()) ``` --- # Evaluation ```python from sklearn.model_selection import TimeSeriesSplit cv = TimeSeriesSplit( n_splits=50, max_train_size=10000, test_size=336, ) ``` --- # Results - Linear Models data:image/s3,"s3://crabby-images/72fcb/72fcb3760dc1f4733339063185648fadaad11406" alt=":scale 100%" --- # Distributions - Linear Models data:image/s3,"s3://crabby-images/10faf/10faf2b503c53666542b63f9ee49a4d5ea81825a" alt=":scale 100%" --- # Calibration for Regression .g.g-middle[ .g-8[ data:image/s3,"s3://crabby-images/39333/393332e620f596394c32c6ad17eee2027d5971b1" alt="" ] .g-4[ ] ] --- # Calibration for Regression .g.g-middle[ .g-8[ data:image/s3,"s3://crabby-images/3aa1a/3aa1ae2c92d8e0399249ff7c8934ac975d42e6fe" alt="" ] .g-4[ ] ] --- # Calibration for Regression .g.g-middle[ .g-8[ data:image/s3,"s3://crabby-images/29032/29032a804afe5ed5a4f10f8932c08e51f5dc7caf" alt="" ] .g-4[ ] ] --- # Calibration for Regression .g.g-middle[ .g-8[ data:image/s3,"s3://crabby-images/29032/29032a804afe5ed5a4f10f8932c08e51f5dc7caf" alt="" ] .g-4[ data:image/s3,"s3://crabby-images/0cd8c/0cd8c97420091376e7ad9c19523e659253b5a97f" alt="" ] ] --- # Calibration - Linear Models data:image/s3,"s3://crabby-images/c9cbb/c9cbbeb5f4d5d58e0a2af03ef38728a5232420a5" alt=":scale 100%" --- class: chapter-slide # Random Forest 🎄🎄🎄 --- # Random Forest With Poisson 🎄🎄🎄 .larger[ **`RandomForestRegressor(criterion="poisson")`** ] --- # How does Poisson Influence the Random Forest? $$ \text{H}(Q) = \dfrac{1}{n}\sum_{i\in Q} y_i * \log\left(\dfrac{y_i}{\hat{y}_i}\right) + \hat{y}_i - y_i $$ .g.g-middle[ .g-6[ where - $y_i$ is the true value - $\hat{y}_i$ is the predicted value - $Q$ is a node, - $n$ is the number of data points in node. ] .g-6[ data:image/s3,"s3://crabby-images/423ae/423aeb7ab9c093686c9b06373abf7c2fe5e7840f" alt="" ] ] .footnote-back[ [Details in User Guide](https://scikit-learn.org/stable/modules/tree.html#mathematical-formulation) ] --- # Random Forest .center[ data:image/s3,"s3://crabby-images/d80c7/d80c7796396d99cc551527b9aa650f0f0cef77fa" alt=":scale 60%" ] --- # Results - Random Forest data:image/s3,"s3://crabby-images/1f06e/1f06e10c8e91120ff3fe44407caf9f4bd5ae1b3a" alt=":scale 100%" --- # Distributions - Random Forest data:image/s3,"s3://crabby-images/b9c9b/b9c9b3f4193cbe626cccb4b33eb8e1ad3d39d9f6" alt=":scale 100%" --- # Calibration - Random Forest data:image/s3,"s3://crabby-images/caa38/caa38d29693e8693a47ba0c460b5df3e4e7457bb" alt=":scale 100%" --- class: chapter-slide # Histogram-based Gradient Boosting Trees 🏂 --- # HistGradientBoostingRegressor With Poisson 🏂 .larger[ **`HistGradientBoostingRegressor(loss="poisson")`** ] --- class: top # HistGradientBoostingRegressor Overview 🏂 ## Initial Condition $$ \hat{f}^{(0)} = C $$ where $C$ is the maximum likelihood estimate for **loss**. -- ## Iterations $$ \hat{f}^{(t)}=\hat{f}^{(t-1)} + \nu \hat{h}^{(t)} $$ where - $\hat{f}^{(t)}$ is the regressor at iteration $t$ - $\nu$ are learning rate - $\\hat{h}^{(t)}$ are trees using the **gradient** and **hessians**. --- class: top
# How does Poisson Influence the Algorithm? ## Growing Trees $\hat{h}^{(t)}$ by to evaluating splits: .g[ .g-8[ $$ \text{Gain} = \frac{1}{2}\left[\dfrac{G_L^2}{H_L+\lambda} + \dfrac{G_R^2}{H_R+\lambda} - \frac{(G_L+G_R)^2}{H_L+H_R+\lambda}\right] $$ where - $\lambda$ is the l2 regularization - $G_L$ and $G_R$ is sum of the **gradients** - $H_L$ and $H_R$ is sum of the **hessians** ] .g-4[ data:image/s3,"s3://crabby-images/3e8b2/3e8b25f42c249b86ee320172095e38a27b4dc60a" alt="" ] ] -- ## More Details @ [SciPy 2019 Fast Gradient Boosting Decision Trees with PyGBM and Numba](https://www.youtube.com/watch?v=cLpIh8Aiy2w) --- class: top
# Linking $f$ with $y$ $$ \hat{y}^{(t)} = h(\hat{f}^{(t)}) $$ where $h$ is the inverse link function. -- ## Poisson's Inverse Link function $$ h(z) = \exp(z) $$ -- ## Looks link the GLMs $$ \hat{y}(w, X) = h(Xw) $$ --- # Results - Hist Gradient Boosting data:image/s3,"s3://crabby-images/ee022/ee022665e3150d5cfee4639f0b8457737d4a395d" alt=":scale 100%" --- # Results - Hist Gradient Boosting 🔎 data:image/s3,"s3://crabby-images/ded71/ded71c00cec9b8c8a3c5bbdab3a24c62a9addb96" alt=":scale 100%" --- # Distributions - Hist Gradient Boosting data:image/s3,"s3://crabby-images/62b29/62b299beb7e67a5bb88c492e669653f36b33cdf3" alt=":scale 100%" --- # Calibration - Hist Gradient Boosting data:image/s3,"s3://crabby-images/f57ae/f57aef53fc6b0848c94cd19295c9ff6b53ce2e8f" alt=":scale 100%" --- # Example of Predictions data:image/s3,"s3://crabby-images/7129d/7129d66886f4a7a2c7554e12c933bb7cd0105e33" alt=":scale 100%" --- # Poisson Regression with Bike Share Data 🚲🚲🚲 ## `PoissonRegressor()` ## `RandomForestRegressor(criterion="poisson")` ## `HistGradientBoostingRegressor(loss="poisson")` --- class: chapter-slide # Two More Topics 🔎 --- # Zero-Inflated Poisson Regression data:image/s3,"s3://crabby-images/a0aa6/a0aa621717f2d8f7a3438f5da15b913f50c1168b" alt=":scale 100%" --- # Zero-Inflated Poisson Regression ## Scikit-lego! .larger[ ```python from sklego.meta import ZeroInflatedRegressor poisson_zero = ZeroInflatedRegressor( * classifier=HistGradientBoostingClassifier(), * regressor=HistGradientBoostingRegressor(loss="poisson"), ) ``` ] --- # Exposure data:image/s3,"s3://crabby-images/3be9d/3be9d4498502c56ff56ba3b307817f5f6e323d93" alt="" --- # Exposure ```python df["Frequency"] = df["ClaimNb"] / df["Exposure"] poisson_gbrt.fit( df_train, df_train["Frequency"], * regressor__sample_weight=df_train["Exposure"], ) ``` [Poisson regression and non-normal loss example](https://scikit-learn.org/stable/auto_examples/linear_model/plot_poisson_regression_non_normal_loss.html#sphx-glr-auto-examples-linear-model-plot-poisson-regression-non-normal-loss-py) --- class: title-slide, left .center[ # Counting on Poisson Regression data:image/s3,"s3://crabby-images/61236/61236c366ef808f76b6becbd8a8bd391e1ef4077" alt=":scale 20%" ] .g[ .g-9[ #### `PoissonRegressor()` #### `RandomForestRegressor(criterion="poisson")` #### `HistGradientBoostingRegressor(loss="poisson")` ] .g-3.center[ Thomas J. Fan
@thomasjpfan
] ] .center[
This workshop on Github: github.com/thomasjpfan/scipy-2022-poisson
]