9.1. Polars Tips & Tricks#
9.1.1. Plugin for Data Science Functions#
Polars is becoming increasingly popular.
If you have already ditched pandas for it, you don’t have to reimplement all of your helper functions in Polars yourself.
polars-ds, a community plugin, has reimplemented common data science functions such as:
Statistical tests (t-test, …)
String similarities (Levenshtein, …)
Loss functions and metrics (ROC, R2, L1, Huber, …)
!pip install polars_ds
import polars_ds
import polars as pl
df = pl.DataFrame(...)
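# Calculate loss and metrics per group.
# Note: the `num_ext` expression namespace comes from early polars_ds releases;
# check the documentation of your installed version if it is not available.
df.group_by("dummy_groups").agg(
    pl.col("actual").num_ext.l2_loss(pl.col("predicted")).alias("l2"),
    pl.col("actual").num_ext.bce(pl.col("predicted")).alias("log loss"),
    pl.col("actual").num_ext.binary_metrics_combo(pl.col("predicted")).alias("combo"),
).unnest("combo")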
9.1.2. Plugin for Fitting Linear Models#
In Polars, you can fit linear models with the polars-ols extension.
It supports ordinary, weighted, and regularized least squares, such as Lasso or Elastic Net.
It can be 2x to 88x faster than popular libraries like sklearn or statsmodels.
!pip install polars-ols
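import polars as pl
import polars_ols as pls  # importing registers the `least_squares` expression namespace
# `df` is assumed to be a DataFrame with the columns "y", "x1", "x2" and "group".
# Fit a separate Lasso model within each group.
lasso_expr = pl.col("y").least_squares.lasso("x1", "x2", alpha=0.0001, add_intercept=True).over("group")
predictions = df.with_columns(lasso_expr.round(2).alias("predictions_lasso"))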