.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "tutorials/finetune_regressor.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_tutorials_finetune_regressor.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_tutorials_finetune_regressor.py:


Fine-tuning TabICL for regression
=================================

Adapt a pretrained TabICL regressor to a single dataset with
:class:`tabicl.FinetunedTabICLRegressor`. Fine-tuning minimizes the pinball
loss on the raw quantile outputs, the same objective the pretrained head was
fit with.
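
For readers unfamiliar with the objective, the pinball (quantile) loss at
level ``q`` weights under-prediction by ``q`` and over-prediction by
``1 - q``. The NumPy sketch below illustrates the textbook definition only.
The function name ``pinball_loss`` and the example quantile levels are
illustrative assumptions, not part of the ``tabicl`` API, and the library's
own implementation may differ in reduction or weighting.

.. code-block:: Python

    import numpy as np


    def pinball_loss(y_true, y_pred, quantiles):
        """Mean pinball loss over samples and quantile levels.

        y_true: shape (n,); y_pred: shape (n, k), one column per level;
        quantiles: shape (k,), levels in (0, 1).
        """
        y_true = np.asarray(y_true, dtype=float)[:, None]
        y_pred = np.asarray(y_pred, dtype=float)
        q = np.asarray(quantiles, dtype=float)[None, :]
        diff = y_true - y_pred  # positive where the prediction is too low
        return float(np.mean(np.maximum(q * diff, (q - 1.0) * diff)))


    # Toy check: one sample with true value 1.0 and three predicted quantiles.
    loss = pinball_loss([1.0], [[0.0, 1.0, 2.0]], [0.1, 0.5, 0.9])

In the toy check the 0.1 quantile is too low and the 0.9 quantile is too high
by the same amount, so both contribute 0.1 and the exact median contributes
nothing.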

.. note::

    A CUDA GPU is recommended for large-scale fine-tuning. Multi-GPU runs can
    be launched with ``torchrun --nproc-per-node=N`` and are auto-detected.

.. GENERATED FROM PYTHON SOURCE LINES 16-26

.. code-block:: Python

    import os

    import matplotlib.pyplot as plt
    import numpy as np
    from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
    from sklearn.model_selection import train_test_split

    from tabicl import FinetunedTabICLRegressor, TabICLRegressor


.. GENERATED FROM PYTHON SOURCE LINES 27-29

Target: one easy feature (sine), one hard feature (spike)
---------------------------------------------------------

.. GENERATED FROM PYTHON SOURCE LINES 29-58

.. code-block:: Python

    def target_fn(x: np.ndarray) -> np.ndarray:
        return 0.8 * np.sin(1.2 * x) + 2.5 * np.exp(-80.0 * (x - 1.0) ** 2)


    def make_dataset(n_samples: int = 1_000, random_state: int = 0):
        rng = np.random.RandomState(random_state)
        x = rng.uniform(-3.0, 3.0, size=n_samples)
        y = target_fn(x) + rng.normal(0.0, 0.08, size=n_samples)
        X = x.reshape(-1, 1).astype(np.float32)
        return X, y.astype(np.float32)


    X, y = make_dataset(n_samples=1000, random_state=0)

    # Split: 40 train (sparse at the spike) / 200 val (early stopping) / 760 test.
    X_train, X_rest, y_train, y_rest = train_test_split(X, y, train_size=40, random_state=0)
    X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, train_size=200, random_state=0)

    is_main_process = int(os.environ.get("LOCAL_RANK", "0")) == 0


    def _metrics(pred: np.ndarray, y_true: np.ndarray) -> tuple[float, float, float]:
        return (
            float(mean_squared_error(y_true, pred)),
            float(mean_absolute_error(y_true, pred)),
            float(r2_score(y_true, pred)),
        )
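
``torchrun`` sets a ``LOCAL_RANK`` environment variable for each worker
process, which is why the script derives ``is_main_process`` from it. A
minimal sketch of how such a flag is typically used to avoid duplicated
console output is shown below. The helper names ``is_main`` and
``print_on_main`` are hypothetical and not part of ``tabicl``.

.. code-block:: Python

    import os


    def is_main(environ=os.environ) -> bool:
        # Without torchrun the variable is unset, so a plain `python` run
        # counts as the main process.
        return int(environ.get("LOCAL_RANK", "0")) == 0


    def print_on_main(message: str, environ=os.environ) -> bool:
        """Print only on local rank 0; return whether the message was printed."""
        if is_main(environ):
            print(message)
            return True
        return False

Note that ``LOCAL_RANK`` is zero once per node, so on a multi-node launch this
guard lets one process per node through.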


.. GENERATED FROM PYTHON SOURCE LINES 59-63

Baseline — zero-shot TabICL
---------------------------

Expected: the zero-shot model traces the sine but smears the narrow spike
near ``x = 1``.
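
A little arithmetic on ``target_fn`` shows why the spike is hard. The sketch
below computes the full width at half maximum (FWHM) of the bump
``2.5 * exp(-80 * (x - 1)**2)`` and the expected number of the 40 uniformly
drawn training points that land inside it. The variable names are
illustrative, and the sine term is ignored for this estimate.

.. code-block:: Python

    import math

    n_train = 40
    x_low, x_high = -3.0, 3.0  # sampling range used in make_dataset
    sharpness = 80.0           # coefficient inside the exponential in target_fn

    # exp(-sharpness * d**2) = 0.5  =>  d = sqrt(ln(2) / sharpness)
    half_width = math.sqrt(math.log(2.0) / sharpness)
    fwhm = 2.0 * half_width
    fraction_in_spike = fwhm / (x_high - x_low)
    expected_train_points = n_train * fraction_in_spike

    print(f"FWHM = {fwhm:.3f}, expected training points inside = {expected_train_points:.2f}")

The bump is about 0.19 wide on an interval of length 6, so on average only
about one training point falls inside it.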

.. GENERATED FROM PYTHON SOURCE LINES 63-71

.. code-block:: Python

    base = TabICLRegressor(n_estimators=4, random_state=0)
    base.fit(X_train, y_train)
    base_pred = base.predict(X_test)
    base_mse, base_mae, base_r2 = _metrics(base_pred, y_test)
    # Captured for the training-curve reference line in Figure 2.
    base_val_mse = float(mean_squared_error(y_val, base.predict(X_val)))


.. GENERATED FROM PYTHON SOURCE LINES 72-78

Fine-tune
---------

``_HistoryLogger`` below is installed through ``_make_experiment_logger``, the
same hook that ``wandb_kwargs`` uses. It captures per-epoch validation metrics
for Figure 2 without pulling in W&B.
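
Judging from the methods that ``_HistoryLogger`` implements, the hook appears
to expect an object with ``setup``, ``log_step``, ``log_epoch``, and
``finish`` methods. The sketch below writes that interface down as a
``typing.Protocol`` and exercises it with a stand-in loop. The names
``ExperimentLogger``, ``ListLogger``, and ``run_fake_training`` are
hypothetical, and the call order inside ``tabicl`` is an assumption.

.. code-block:: Python

    from typing import Any, Protocol


    class ExperimentLogger(Protocol):
        def setup(self, config: dict[str, Any]) -> None: ...
        def log_step(self, metrics: dict[str, float], step: int) -> None: ...
        def log_epoch(self, metrics: dict[str, float], step: int) -> None: ...
        def finish(self) -> None: ...


    class ListLogger:
        """Keep every per-epoch metrics dict in memory."""

        def __init__(self) -> None:
            self.epochs: list[dict[str, float]] = []
            self.finished = False

        def setup(self, config: dict[str, Any]) -> None:
            self.config = dict(config)

        def log_step(self, metrics: dict[str, float], step: int) -> None:
            pass  # per-step metrics are ignored in this sketch

        def log_epoch(self, metrics: dict[str, float], step: int) -> None:
            self.epochs.append(dict(metrics))

        def finish(self) -> None:
            self.finished = True


    def run_fake_training(logger: ExperimentLogger, n_epochs: int = 3) -> None:
        # Stand-in for a training loop; the metric values are made up.
        logger.setup({"epochs": n_epochs})
        for epoch in range(n_epochs):
            logger.log_step({"train/mean_loss": 1.0 / (epoch + 1)}, step=epoch)
            logger.log_epoch({"train/epoch": epoch, "val/mse": 0.1 / (epoch + 1)}, step=epoch)
        logger.finish()


    logger = ListLogger()
    run_fake_training(logger)

Because the protocol is structural, any class with these four methods
satisfies it without inheriting from a base class.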

.. GENERATED FROM PYTHON SOURCE LINES 78-122

.. code-block:: Python

    history: dict[str, list[float]] = {
        "epoch": [],
        "val_mse": [],
        "val_mae": [],
        "val_r2": [],
        "train_loss": [],
    }


    class _HistoryLogger:
        """Record per-epoch validation metrics into ``history``."""

        def setup(self, config):
            del config

        def log_step(self, metrics, step):
            del metrics, step

        def log_epoch(self, metrics, step):
            del step
            history["epoch"].append(int(metrics.get("train/epoch", len(history["epoch"]))) + 1)
            history["val_mse"].append(float(metrics.get("val/mse", np.nan)))
            history["val_mae"].append(float(metrics.get("val/mae", np.nan)))
            history["val_r2"].append(float(metrics.get("val/r2", np.nan)))
            history["train_loss"].append(float(metrics.get("train/mean_loss", np.nan)))

        def finish(self):
            pass


    reg = FinetunedTabICLRegressor(
        epochs=60,
        learning_rate=1e-5,
        n_estimators_finetune=2,
        n_estimators_validation=2,
        n_estimators_inference=4,
        early_stopping=True,
        patience=10,
        random_state=0,
        verbose=True,
    )
    reg._make_experiment_logger = lambda: _HistoryLogger()
    reg.fit(X_train, y_train, X_val=X_val, y_val=y_val)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    /home/docs/checkouts/readthedocs.org/user_builds/tabicl/checkouts/stable/tutorials/finetune_regressor.py:120: UserWarning: `output_dir` is not set; no checkpoints will be saved and all fine-tuning progress is lost if the run is interrupted.
      reg.fit(X_train, y_train, X_val=X_val, y_val=y_val)
    Baseline val mse: -0.0586
    Fine-tune:  85%|████████▌ | 51/60 [00:24<00:04,  2.05it/s, train_loss=0.0267, val_mse=-0.0306, best=-0.0295, s/epoch=0.2]
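
In the log above, ``val_mse`` is reported with a negative sign, and the final
``best`` (-0.0295) is larger than the baseline (-0.0586). This suggests a
higher-is-better (negated MSE) convention, a reading inferred from the output
rather than from the library's documentation. Under that assumption,
patience-based early stopping can be sketched as follows. The function
``early_stop_epoch`` and the toy score sequence are illustrative and do not
reproduce ``tabicl``'s internal logic.

.. code-block:: Python

    def early_stop_epoch(scores, patience):
        """Return (stop_index, best_index) for a higher-is-better sequence.

        Training stops at the first index where `patience` consecutive
        evaluations have failed to beat the best score seen so far. If that
        never happens, stop_index is the last index.
        """
        best_index = 0
        best_score = scores[0]
        for i in range(1, len(scores)):
            if scores[i] > best_score:
                best_index, best_score = i, scores[i]
            elif i - best_index >= patience:
                return i, best_index
        return len(scores) - 1, best_index


    # Toy negated-MSE curve: improves until index 3, then degrades.
    toy_scores = [-0.060, -0.055, -0.050, -0.045, -0.046, -0.047, -0.048, -0.049]
    stop, best = early_stop_epoch(toy_scores, patience=3)

With ``patience=3`` the toy run stops at index 6, three evaluations after the
best score at index 3.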


.. raw:: html

    <div class="output_subarea output_html rendered_html output_result">
    <style>#sk-container-id-2 {
      /* Definition of color scheme common for light and dark mode */
      --sklearn-color-text: #000;
      --sklearn-color-text-muted: #666;
      --sklearn-color-line: gray;
      /* Definition of color scheme for unfitted estimators */
      --sklearn-color-unfitted-level-0: #fff5e6;
      --sklearn-color-unfitted-level-1: #f6e4d2;
      --sklearn-color-unfitted-level-2: #ffe0b3;
      --sklearn-color-unfitted-level-3: chocolate;
      /* Definition of color scheme for fitted estimators */
      --sklearn-color-fitted-level-0: #f0f8ff;
      --sklearn-color-fitted-level-1: #d4ebff;
      --sklearn-color-fitted-level-2: #b3dbfd;
      --sklearn-color-fitted-level-3: cornflowerblue;
    }

    #sk-container-id-2.light {
      /* Specific color for light theme */
      --sklearn-color-text-on-default-background: black;
      --sklearn-color-background: white;
      --sklearn-color-border-box: black;
      --sklearn-color-icon: #696969;
    }

    #sk-container-id-2.dark {
      --sklearn-color-text-on-default-background: white;
      --sklearn-color-background: #111;
      --sklearn-color-border-box: white;
      --sklearn-color-icon: #878787;
    }

    #sk-container-id-2 {
      color: var(--sklearn-color-text);
    }

    #sk-container-id-2 pre {
      padding: 0;
    }

    #sk-container-id-2 input.sk-hidden--visually {
      border: 0;
      clip: rect(1px 1px 1px 1px);
      clip: rect(1px, 1px, 1px, 1px);
      height: 1px;
      margin: -1px;
      overflow: hidden;
      padding: 0;
      position: absolute;
      width: 1px;
    }

    #sk-container-id-2 div.sk-dashed-wrapped {
      border: 1px dashed var(--sklearn-color-line);
      margin: 0 0.4em 0.5em 0.4em;
      box-sizing: border-box;
      padding-bottom: 0.4em;
      background-color: var(--sklearn-color-background);
    }

    #sk-container-id-2 div.sk-container {
      /* jupyter's `normalize.less` sets `[hidden] { display: none; }`
         but bootstrap.min.css set `[hidden] { display: none !important; }`
         so we also need the `!important` here to be able to override the
         default hidden behavior on the sphinx rendered scikit-learn.org.
         See: https://github.com/scikit-learn/scikit-learn/issues/21755 */
      display: inline-block !important;
      position: relative;
    }

    #sk-container-id-2 div.sk-text-repr-fallback {
      display: none;
    }

    div.sk-parallel-item,
    div.sk-serial,
    div.sk-item {
      /* draw centered vertical line to link estimators */
      background-image: linear-gradient(var(--sklearn-color-text-on-default-background), var(--sklearn-color-text-on-default-background));
      background-size: 2px 100%;
      background-repeat: no-repeat;
      background-position: center center;
    }

    /* Parallel-specific style estimator block */

    #sk-container-id-2 div.sk-parallel-item::after {
      content: "";
      width: 100%;
      border-bottom: 2px solid var(--sklearn-color-text-on-default-background);
      flex-grow: 1;
    }

    #sk-container-id-2 div.sk-parallel {
      display: flex;
      align-items: stretch;
      justify-content: center;
      background-color: var(--sklearn-color-background);
      position: relative;
    }

    #sk-container-id-2 div.sk-parallel-item {
      display: flex;
      flex-direction: column;
    }

    #sk-container-id-2 div.sk-parallel-item:first-child::after {
      align-self: flex-end;
      width: 50%;
    }

    #sk-container-id-2 div.sk-parallel-item:last-child::after {
      align-self: flex-start;
      width: 50%;
    }

    #sk-container-id-2 div.sk-parallel-item:only-child::after {
      width: 0;
    }

    /* Serial-specific style estimator block */

    #sk-container-id-2 div.sk-serial {
      display: flex;
      flex-direction: column;
      align-items: center;
      background-color: var(--sklearn-color-background);
      padding-right: 1em;
      padding-left: 1em;
    }


    /* Toggleable style: style used for estimator/Pipeline/ColumnTransformer box that is
    clickable and can be expanded/collapsed.
    - Pipeline and ColumnTransformer use this feature and define the default style
    - Estimators will overwrite some part of the style using the `sk-estimator` class
    */

    /* Pipeline and ColumnTransformer style (default) */

    #sk-container-id-2 div.sk-toggleable {
      /* Default theme specific background. It is overwritten whether we have a
      specific estimator or a Pipeline/ColumnTransformer */
      background-color: var(--sklearn-color-background);
    }

    /* Toggleable label */
    #sk-container-id-2 label.sk-toggleable__label {
      cursor: pointer;
      display: flex;
      width: 100%;
      margin-bottom: 0;
      padding: 0.5em;
      box-sizing: border-box;
      text-align: center;
      align-items: center;
      justify-content: center;
      gap: 0.5em;
    }

    #sk-container-id-2 label.sk-toggleable__label .caption {
      font-size: 0.6rem;
      font-weight: lighter;
      color: var(--sklearn-color-text-muted);
    }

    #sk-container-id-2 label.sk-toggleable__label-arrow:before {
      /* Arrow on the left of the label */
      content: "▸";
      float: left;
      margin-right: 0.25em;
      color: var(--sklearn-color-icon);
    }

    #sk-container-id-2 label.sk-toggleable__label-arrow:hover:before {
      color: var(--sklearn-color-text);
    }

    /* Toggleable content - dropdown */

    #sk-container-id-2 div.sk-toggleable__content {
      display: none;
      text-align: left;
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-2 div.sk-toggleable__content.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    #sk-container-id-2 div.sk-toggleable__content pre {
      margin: 0.2em;
      border-radius: 0.25em;
      color: var(--sklearn-color-text);
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-2 div.sk-toggleable__content.fitted pre {
      /* unfitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    #sk-container-id-2 input.sk-toggleable__control:checked~div.sk-toggleable__content {
      /* Expand drop-down */
      display: block;
      width: 100%;
      overflow: visible;
    }

    #sk-container-id-2 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before {
      content: "▾";
    }

    /* Pipeline/ColumnTransformer-specific style */

    #sk-container-id-2 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-2 div.sk-label.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Estimator-specific style */

    /* Colorize estimator box */
    #sk-container-id-2 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-2 div.sk-estimator.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-2);
    }

    #sk-container-id-2 div.sk-label label.sk-toggleable__label,
    #sk-container-id-2 div.sk-label label {
      /* The background is the default theme color */
      color: var(--sklearn-color-text-on-default-background);
    }

    /* On hover, darken the color of the background */
    #sk-container-id-2 div.sk-label:hover label.sk-toggleable__label {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    /* Label box, darken color on hover, fitted */
    #sk-container-id-2 div.sk-label.fitted:hover label.sk-toggleable__label.fitted {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Estimator label */

    #sk-container-id-2 div.sk-label label {
      font-family: monospace;
      font-weight: bold;
      line-height: 1.2em;
    }

    #sk-container-id-2 div.sk-label-container {
      text-align: center;
    }

    /* Estimator-specific */
    #sk-container-id-2 div.sk-estimator {
      font-family: monospace;
      border: 1px dotted var(--sklearn-color-border-box);
      border-radius: 0.25em;
      box-sizing: border-box;
      margin-bottom: 0.5em;
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-2 div.sk-estimator.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    /* on hover */
    #sk-container-id-2 div.sk-estimator:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-2 div.sk-estimator.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Specification for estimator info (e.g. "i" and "?") */

    /* Common style for "i" and "?" */

    .sk-estimator-doc-link,
    a:link.sk-estimator-doc-link,
    a:visited.sk-estimator-doc-link {
      float: right;
      font-size: smaller;
      line-height: 1em;
      font-family: monospace;
      background-color: var(--sklearn-color-unfitted-level-0);
      border-radius: 1em;
      height: 1em;
      width: 1em;
      text-decoration: none !important;
      margin-left: 0.5em;
      text-align: center;
      /* unfitted */
      border: var(--sklearn-color-unfitted-level-3) 1pt solid;
      color: var(--sklearn-color-unfitted-level-3);
    }

    .sk-estimator-doc-link.fitted,
    a:link.sk-estimator-doc-link.fitted,
    a:visited.sk-estimator-doc-link.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
      border: var(--sklearn-color-fitted-level-3) 1pt solid;
      color: var(--sklearn-color-fitted-level-3);
    }

    /* On hover */
    div.sk-estimator:hover .sk-estimator-doc-link:hover,
    .sk-estimator-doc-link:hover,
    div.sk-label-container:hover .sk-estimator-doc-link:hover,
    .sk-estimator-doc-link:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-3);
      border: var(--sklearn-color-fitted-level-0) 1pt solid;
      color: var(--sklearn-color-unfitted-level-0);
      text-decoration: none;
    }

    div.sk-estimator.fitted:hover .sk-estimator-doc-link.fitted:hover,
    .sk-estimator-doc-link.fitted:hover,
    div.sk-label-container:hover .sk-estimator-doc-link.fitted:hover,
    .sk-estimator-doc-link.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-3);
      border: var(--sklearn-color-fitted-level-0) 1pt solid;
      color: var(--sklearn-color-fitted-level-0);
      text-decoration: none;
    }

    /* Span, style for the box shown on hovering the info icon */
    .sk-estimator-doc-link span {
      display: none;
      z-index: 9999;
      position: relative;
      font-weight: normal;
      right: .2ex;
      padding: .5ex;
      margin: .5ex;
      width: min-content;
      min-width: 20ex;
      max-width: 50ex;
      color: var(--sklearn-color-text);
      box-shadow: 2pt 2pt 4pt #999;
      /* unfitted */
      background: var(--sklearn-color-unfitted-level-0);
      border: .5pt solid var(--sklearn-color-unfitted-level-3);
    }

    .sk-estimator-doc-link.fitted span {
      /* fitted */
      background: var(--sklearn-color-fitted-level-0);
      border: var(--sklearn-color-fitted-level-3);
    }

    .sk-estimator-doc-link:hover span {
      display: block;
    }

    /* "?"-specific style due to the `<a>` HTML tag */

    #sk-container-id-2 a.estimator_doc_link {
      float: right;
      font-size: 1rem;
      line-height: 1em;
      font-family: monospace;
      background-color: var(--sklearn-color-unfitted-level-0);
      border-radius: 1rem;
      height: 1rem;
      width: 1rem;
      text-decoration: none;
      /* unfitted */
      color: var(--sklearn-color-unfitted-level-1);
      border: var(--sklearn-color-unfitted-level-1) 1pt solid;
    }

    #sk-container-id-2 a.estimator_doc_link.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
      border: var(--sklearn-color-fitted-level-1) 1pt solid;
      color: var(--sklearn-color-fitted-level-1);
    }

    /* On hover */
    #sk-container-id-2 a.estimator_doc_link:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-3);
      color: var(--sklearn-color-background);
      text-decoration: none;
    }

    #sk-container-id-2 a.estimator_doc_link.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-3);
    }

    .estimator-table {
        font-family: monospace;
    }

    .estimator-table summary {
        padding: .5rem;
        cursor: pointer;
    }

    .estimator-table summary::marker {
        font-size: 0.7rem;
    }

    .estimator-table details[open] {
        padding-left: 0.1rem;
        padding-right: 0.1rem;
        padding-bottom: 0.3rem;
    }

    .estimator-table .parameters-table {
        margin-left: auto !important;
        margin-right: auto !important;
        margin-top: 0;
    }

    .estimator-table .parameters-table tr:nth-child(odd) {
        background-color: #fff;
    }

    .estimator-table .parameters-table tr:nth-child(even) {
        background-color: #f6f6f6;
    }

    .estimator-table .parameters-table tr:hover {
        background-color: #e0e0e0;
    }

    .estimator-table table td {
        border: 1px solid rgba(106, 105, 104, 0.232);
    }

    /*
        `table td`is set in notebook with right text-align.
        We need to overwrite it.
    */
    .estimator-table table td.param {
        text-align: left;
        position: relative;
        padding: 0;
    }

    .user-set td {
        color:rgb(255, 94, 0);
        text-align: left !important;
    }

    .user-set td.value {
        color:rgb(255, 94, 0);
        background-color: transparent;
    }

    .default td {
        color: black;
        text-align: left !important;
    }

    .user-set td i,
    .default td i {
        color: black;
    }

    /*
        Styles for parameter documentation links
        We need styling for visited so jupyter doesn't overwrite it
    */
    a.param-doc-link,
    a.param-doc-link:link,
    a.param-doc-link:visited {
        text-decoration: underline dashed;
        text-underline-offset: .3em;
        color: inherit;
        display: block;
        padding: .5em;
    }

    /* "hack" to make the entire area of the cell containing the link clickable */
    a.param-doc-link::before {
        position: absolute;
        content: "";
        inset: 0;
    }

    .param-doc-description {
        display: none;
        position: absolute;
        z-index: 9999;
        left: 0;
        padding: .5ex;
        margin-left: 1.5em;
        color: var(--sklearn-color-text);
        box-shadow: .3em .3em .4em #999;
        width: max-content;
        text-align: left;
        max-height: 10em;
        overflow-y: auto;

        /* unfitted */
        background: var(--sklearn-color-unfitted-level-0);
        border: thin solid var(--sklearn-color-unfitted-level-3);
    }

    /* Fitted state for parameter tooltips */
    .fitted .param-doc-description {
        /* fitted */
        background: var(--sklearn-color-fitted-level-0);
        border: thin solid var(--sklearn-color-fitted-level-3);
    }

    .param-doc-link:hover .param-doc-description {
        display: block;
    }

    .copy-paste-icon {
        background-image: url(data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCA0NDggNTEyIj48IS0tIUZvbnQgQXdlc29tZSBGcmVlIDYuNy4yIGJ5IEBmb250YXdlc29tZSAtIGh0dHBzOi8vZm9udGF3ZXNvbWUuY29tIExpY2Vuc2UgLSBodHRwczovL2ZvbnRhd2Vzb21lLmNvbS9saWNlbnNlL2ZyZWUgQ29weXJpZ2h0IDIwMjUgRm9udGljb25zLCBJbmMuLS0+PHBhdGggZD0iTTIwOCAwTDMzMi4xIDBjMTIuNyAwIDI0LjkgNS4xIDMzLjkgMTQuMWw2Ny45IDY3LjljOSA5IDE0LjEgMjEuMiAxNC4xIDMzLjlMNDQ4IDMzNmMwIDI2LjUtMjEuNSA0OC00OCA0OGwtMTkyIDBjLTI2LjUgMC00OC0yMS41LTQ4LTQ4bDAtMjg4YzAtMjYuNSAyMS41LTQ4IDQ4LTQ4ek00OCAxMjhsODAgMCAwIDY0LTY0IDAgMCAyNTYgMTkyIDAgMC0zMiA2NCAwIDAgNDhjMCAyNi41LTIxLjUgNDgtNDggNDhMNDggNTEyYy0yNi41IDAtNDgtMjEuNS00OC00OEwwIDE3NmMwLTI2LjUgMjEuNS00OCA0OC00OHoiLz48L3N2Zz4=);
        background-repeat: no-repeat;
        background-size: 14px 14px;
        background-position: 0;
        display: inline-block;
        width: 14px;
        height: 14px;
        cursor: pointer;
    }
    </style><body><div id="sk-container-id-2" class="sk-top-container"><div class="sk-text-repr-fallback"><pre>FinetunedTabICLRegressor(epochs=60, n_estimators_inference=4, patience=10,
                             random_state=0, verbose=True)</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class="sk-container" hidden><div class="sk-item"><div class="sk-estimator fitted sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-2" type="checkbox" checked><label for="sk-estimator-id-2" class="sk-toggleable__label fitted sk-toggleable__label-arrow"><div><div>FinetunedTabICLRegressor</div></div><div><span class="sk-estimator-doc-link fitted">i<span>Fitted</span></span></div></label><div class="sk-toggleable__content fitted" data-param-prefix="">
            <div class="estimator-table">
                <details>
                    <summary>Parameters</summary>
                    <table class="parameters-table">
                      <tbody>
                    
            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('epochs',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">epochs</td>
                <td class="value">60</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('learning_rate',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">learning_rate</td>
                <td class="value">1e-05</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('weight_decay',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">weight_decay</td>
                <td class="value">0.01</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('grad_clip',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">grad_clip</td>
                <td class="value">1.0</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('amp',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">amp</td>
                <td class="value">True</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('use_lr_scheduler',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">use_lr_scheduler</td>
                <td class="value">True</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('warmup_proportion',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">warmup_proportion</td>
                <td class="value">0.1</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_estimators_finetune',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">n_estimators_finetune</td>
                <td class="value">2</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_estimators_validation',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">n_estimators_validation</td>
                <td class="value">2</td>
            </tr>
    

            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_estimators_inference',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">n_estimators_inference</td>
                <td class="value">4</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('max_data_size',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">max_data_size</td>
                <td class="value">10000</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('finetune_ctx_query_ratio',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">finetune_ctx_query_ratio</td>
                <td class="value">0.2</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('validation_split_ratio',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">validation_split_ratio</td>
                <td class="value">0.1</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('early_stopping',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">early_stopping</td>
                <td class="value">True</td>
            </tr>
    

            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('patience',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">patience</td>
                <td class="value">10</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('min_delta',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">min_delta</td>
                <td class="value">0.0001</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('time_limit',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">time_limit</td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('save_interval',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">save_interval</td>
                <td class="value">1</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('norm_methods',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">norm_methods</td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('feat_shuffle_method',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">feat_shuffle_method</td>
                <td class="value">&#x27;latin&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('outlier_threshold',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">outlier_threshold</td>
                <td class="value">4.0</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('model_path',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">model_path</td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('allow_auto_download',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">allow_auto_download</td>
                <td class="value">True</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('checkpoint_version',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">checkpoint_version</td>
                <td class="value">&#x27;tabicl-regressor-v2-20260212.ckpt&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('freeze_col',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">freeze_col</td>
                <td class="value">False</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('freeze_row',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">freeze_row</td>
                <td class="value">False</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('freeze_icl',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">freeze_icl</td>
                <td class="value">False</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('device',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">device</td>
                <td class="value">None</td>
            </tr>
    

            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('random_state',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">random_state</td>
                <td class="value">0</td>
            </tr>
    

            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('verbose',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">verbose</td>
                <td class="value">True</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('wandb_kwargs',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">wandb_kwargs</td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('eval_metric',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">eval_metric</td>
                <td class="value">&#x27;mse&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('extra_regressor_kwargs',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">extra_regressor_kwargs</td>
                <td class="value">None</td>
            </tr>
    
                      </tbody>
                    </table>
                </details>
            </div>
        </div></div></div></div></div><script>function copyToClipboard(text, element) {
        // Get the parameter prefix from the closest toggleable content
        const toggleableContent = element.closest('.sk-toggleable__content');
        const paramPrefix = toggleableContent ? toggleableContent.dataset.paramPrefix : '';
        const fullParamName = paramPrefix ? `${paramPrefix}${text}` : text;

        const originalStyle = element.style;
        const computedStyle = window.getComputedStyle(element);
        const originalWidth = computedStyle.width;
        const originalHTML = element.innerHTML.replace('Copied!', '');

        navigator.clipboard.writeText(fullParamName)
            .then(() => {
                element.style.width = originalWidth;
                element.style.color = 'green';
                element.innerHTML = "Copied!";

                setTimeout(() => {
                    element.innerHTML = originalHTML;
                    element.style = originalStyle;
                }, 2000);
            })
            .catch(err => {
                console.error('Failed to copy:', err);
                element.style.color = 'red';
                element.innerHTML = "Failed!";
                setTimeout(() => {
                    element.innerHTML = originalHTML;
                    element.style = originalStyle;
                }, 2000);
            });
        return false;
    }

    document.querySelectorAll('.copy-paste-icon').forEach(function(element) {
        const toggleableContent = element.closest('.sk-toggleable__content');
        const paramPrefix = toggleableContent ? toggleableContent.dataset.paramPrefix : '';
        const paramName = element.parentElement.nextElementSibling
            .textContent.trim().split(' ')[0];
        const fullParamName = paramPrefix ? `${paramPrefix}${paramName}` : paramName;

        element.setAttribute('title', fullParamName);
    });


    /**
     * Adapted from Skrub
     * https://github.com/skrub-data/skrub/blob/403466d1d5d4dc76a7ef569b3f8228db59a31dc3/skrub/_reporting/_data/templates/report.js#L789
     * @returns "light" or "dark"
     */
    function detectTheme(element) {
        const body = document.querySelector('body');

        // Check VSCode theme
        const themeKindAttr = body.getAttribute('data-vscode-theme-kind');
        const themeNameAttr = body.getAttribute('data-vscode-theme-name');

        if (themeKindAttr && themeNameAttr) {
            const themeKind = themeKindAttr.toLowerCase();
            const themeName = themeNameAttr.toLowerCase();

            if (themeKind.includes("dark") || themeName.includes("dark")) {
                return "dark";
            }
            if (themeKind.includes("light") || themeName.includes("light")) {
                return "light";
            }
        }

        // Check Jupyter theme
        if (body.getAttribute('data-jp-theme-light') === 'false') {
            return 'dark';
        } else if (body.getAttribute('data-jp-theme-light') === 'true') {
            return 'light';
        }

        // Guess based on a parent element's color
        const color = window.getComputedStyle(element.parentNode, null).getPropertyValue('color');
        const match = color.match(/^rgb\s*\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\)\s*$/i);
        if (match) {
            const [r, g, b] = [
                parseFloat(match[1]),
                parseFloat(match[2]),
                parseFloat(match[3])
            ];

            // https://en.wikipedia.org/wiki/HSL_and_HSV#Lightness
            const luma = 0.299 * r + 0.587 * g + 0.114 * b;

            if (luma > 180) {
                // If the text is very bright we have a dark theme
                return 'dark';
            }
            if (luma < 75) {
                // If the text is very dark we have a light theme
                return 'light';
            }
            // Otherwise fall back to the next heuristic.
        }

        // Fallback to system preference
        return window.matchMedia('(prefers-color-scheme: dark)').matches ? 'dark' : 'light';
    }


    function forceTheme(elementId) {
        const estimatorElement = document.querySelector(`#${elementId}`);
        if (estimatorElement === null) {
            console.error(`Element with id ${elementId} not found.`);
        } else {
            const theme = detectTheme(estimatorElement);
            estimatorElement.classList.add(theme);
        }
    }

    forceTheme('sk-container-id-2');</script></body>
    </div>
    <br />
    <br />

.. GENERATED FROM PYTHON SOURCE LINES 123-125

Evaluate on the held-out test set
---------------------------------

.. GENERATED FROM PYTHON SOURCE LINES 125-142

.. code-block:: Python

    ft_pred = reg.predict(X_test)
    ft_mse, ft_mae, ft_r2 = _metrics(ft_pred, y_test)

    if is_main_process:
        header = f"{'metric':<10}{'pretrained':>14}{'fine-tuned':>14}{'Δ':>14}"
        rule = "=" * len(header)
        print()
        print(rule)
        print(f"Test-set metrics  (n_train={len(X_train)}, n_test={len(X_test)})")
        print(rule)
        print(header)
        print("-" * len(header))
        print(f"{'MSE ↓':<10}{base_mse:>14.4f}{ft_mse:>14.4f}{ft_mse - base_mse:>+14.4f}")
        print(f"{'MAE ↓':<10}{base_mae:>14.4f}{ft_mae:>14.4f}{ft_mae - base_mae:>+14.4f}")
        print(f"{'R² ↑':<10}{base_r2:>14.4f}{ft_r2:>14.4f}{ft_r2 - base_r2:>+14.4f}")
        print(rule)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    ====================================================
    Test-set metrics  (n_train=40, n_test=760)
    ====================================================
    metric        pretrained    fine-tuned             Δ
    ----------------------------------------------------
    MSE ↓             0.0573        0.0239       -0.0335
    MAE ↓             0.1543        0.1059       -0.0484
    R² ↑              0.8894        0.9540       +0.0646
    ====================================================
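
The absolute deltas above are easier to compare across metrics as relative
changes. A minimal sketch, with the numbers copied by hand from the table
(``relative_change`` is a hypothetical helper, not part of ``tabicl``):

.. code-block:: Python

    # (pretrained, fine-tuned) values copied from the printed table.
    reported = {
        "MSE": (0.0573, 0.0239),
        "MAE": (0.1543, 0.1059),
        "R2": (0.8894, 0.9540),
    }


    def relative_change(before: float, after: float) -> float:
        """Signed change of `after` relative to `before`, in percent."""
        return 100.0 * (after - before) / before


    rel = {name: relative_change(b, a) for name, (b, a) in reported.items()}
    for name, pct in rel.items():
        print(f"{name:<4}{pct:+8.1f} %")

For this run, that is roughly a 58 % lower MSE and a 31 % lower MAE on the
test set, from 40 training points.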


.. GENERATED FROM PYTHON SOURCE LINES 143-148

Figure 1 — Predictions + residuals
----------------------------------

The yellow band marks the full width at half maximum (FWHM) of the spike at
``x = 1``. Fine-tuning should shrink the residuals inside that band while
leaving the rest of the panel essentially unchanged.
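
The band edges follow from the spike term of ``target_fn``,
``2.5 * exp(-80 * (x - 1) ** 2)``: the spike falls to half its peak where
``exp(-80 * d ** 2) = 1/2``. A short check of that arithmetic (the constant
names are illustrative, not taken from the tutorial source):

.. code-block:: Python

    import numpy as np

    # Spike term of target_fn: AMPLITUDE * exp(-SHARPNESS * (x - CENTER) ** 2)
    AMPLITUDE, SHARPNESS, CENTER = 2.5, 80.0, 1.0


    def spike(x):
        return AMPLITUDE * np.exp(-SHARPNESS * (x - CENTER) ** 2)


    # Solve exp(-SHARPNESS * d ** 2) = 1/2 for the half-width d.
    half_width = np.sqrt(np.log(2.0) / SHARPNESS)
    lo, hi = CENTER - half_width, CENTER + half_width

    print(f"half-width = {half_width:.4f}")
    print(f"FWHM band  = [{lo:.3f}, {hi:.3f}]")

The analytic band is about 0.907 to 1.093, so the ``axvspan(0.905, 1.095)``
used in the plotting code is marginally wider than the exact FWHM.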

.. GENERATED FROM PYTHON SOURCE LINES 148-229

.. code-block:: Python

    if is_main_process:
        x_grid = np.linspace(-3.0, 3.0, 600).reshape(-1, 1).astype(np.float32)
        alphas = [0.1, 0.5, 0.9]
        q_base = base.predict(x_grid, output_type="quantiles", alphas=alphas)
        q_ft = reg.predict(x_grid, output_type="quantiles", alphas=alphas)

        # Quantiles on the test set (not the plotting grid) so the residual
        # panel can show the 10–90% band around zero as a quick calibration check.
        qt_base = base.predict(X_test, output_type="quantiles", alphas=alphas)
        qt_ft = reg.predict(X_test, output_type="quantiles", alphas=alphas)
        order = np.argsort(X_test.ravel())
        x_sorted = X_test.ravel()[order]

        fig1, axes = plt.subplots(2, 2, figsize=(13.5, 8.0), sharex=True, sharey="row", constrained_layout=True)
        top = axes[0]
        bot = axes[1]

        # Same emerald as the classifier tutorial for the "ground truth"
        # reference, so the two figures share a visual vocabulary.
        TRUTH_COLOR = "#10b981"
        for ax, title, q, (mse, r2) in [
            (top[0], "Pretrained TabICL", q_base, (base_mse, base_r2)),
            (top[1], "Fine-tuned TabICL", q_ft, (ft_mse, ft_r2)),
        ]:
            ax.fill_between(
                x_grid.ravel(),
                q[:, 0],
                q[:, 2],
                color="#60a5fa",
                alpha=0.25,
                label="10–90 % quantile band",
            )
            ax.plot(x_grid.ravel(), q[:, 1], color="#1d4ed8", lw=2.2, label="predicted median")
            ax.plot(x_grid.ravel(), target_fn(x_grid.ravel()), color=TRUTH_COLOR, lw=2.0, ls="--", label="true target")
            ax.scatter(
                X_train.ravel(),
                y_train,
                c="#b45309",
                edgecolor="white",
                s=32,
                linewidths=0.8,
                label=f"train (n={len(X_train)})",
            )
            # Shade the FWHM of the sharp spike to flag where the failure mode
            # lives. Both panels share the band so the comparison is direct.
            ax.axvspan(0.905, 1.095, color="#fde68a", alpha=0.45, zorder=0, label="spike FWHM")
            ax.set_title(f"{title}\nMSE={mse:.3f}  R²={r2:.3f}", fontsize=12)
            ax.tick_params(labelsize=10)
            ax.grid(alpha=0.25)
        top[0].set_ylabel("y", fontsize=11)
        top[0].legend(loc="lower right", framealpha=0.92, fontsize=9)

        # Residual panels: predicted − true, with the 10–90% band relative to
        # the predicted median so the shaded region is centered on zero.
        for ax, title, pred, qt in [
            (bot[0], "Residuals — pretrained", base_pred, qt_base),
            (bot[1], "Residuals — fine-tuned", ft_pred, qt_ft),
        ]:
            residual = pred - y_test
            lo = (qt[:, 0] - qt[:, 1])[order]
            hi = (qt[:, 2] - qt[:, 1])[order]
            ax.fill_between(x_sorted, lo, hi, color="#60a5fa", alpha=0.22, label="10–90 % band (centered)")
            ax.scatter(X_test.ravel(), residual, c="#334155", s=10, alpha=0.65, label="residual (pred − y)")
            ax.axhline(0, color="black", lw=0.8)
            ax.axvspan(0.905, 1.095, color="#fde68a", alpha=0.45, zorder=0, label="spike FWHM")

            ax.set_title(title, fontsize=12)
            ax.set_xlabel("x", fontsize=11)
            ax.tick_params(labelsize=10)
            ax.grid(alpha=0.25)
        bot[0].set_ylabel("residual", fontsize=11)
        bot[0].legend(loc="lower right", framealpha=0.92, fontsize=9)

        # sharey="row" already aligns the two residual panels; just widen the
        # limits symmetrically around zero so max |residual| from either model
        # fits in both.
        y_res_lim = max(np.abs(base_pred - y_test).max(), np.abs(ft_pred - y_test).max())
        bot[0].set_ylim(-y_res_lim * 1.1, y_res_lim * 1.1)

        fig1.suptitle("Predictions + residuals: pretrained vs. fine-tuned", fontsize=14)


.. image-sg:: /tutorials/images/sphx_glr_finetune_regressor_001.png
   :alt: Predictions + residuals: pretrained vs. fine-tuned, Pretrained TabICL MSE=0.057  R²=0.889, Fine-tuned TabICL MSE=0.024  R²=0.954, Residuals — pretrained, Residuals — fine-tuned
   :srcset: /tutorials/images/sphx_glr_finetune_regressor_001.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 230-235

Figure 2 — Training dynamics + metric comparison
------------------------------------------------

Left: validation MSE per epoch. The dashed line is the pretrained baseline
and the star marks the best epoch (lowest validation MSE), which the safety
net keeps. Right: test-set MSE, MAE, and R² bars for both models.
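
To make "best epoch" concrete, here is a generic patience-based early-stopping
rule. This is only a sketch of the usual technique, not the ``tabicl``
implementation: ``best_epoch_with_patience`` is a hypothetical helper, the
defaults mirror the ``patience`` and ``min_delta`` values in the estimator
repr above, and the loss values are made up.

.. code-block:: Python

    def best_epoch_with_patience(val_losses, patience=10, min_delta=1e-4):
        """Return (best_index, stop_index) for a list of validation losses.

        An epoch counts as an improvement only if it beats the best loss so
        far by more than `min_delta`. Training stops once `patience` epochs
        have passed without such an improvement.
        """
        best_idx, best_loss = 0, float("inf")
        for i, loss in enumerate(val_losses):
            if loss < best_loss - min_delta:
                best_idx, best_loss = i, loss
            elif i - best_idx >= patience:
                return best_idx, i  # stopped early at epoch i
        return best_idx, len(val_losses) - 1  # ran through every epoch


    # Toy curve: improves for three epochs, then drifts upward.
    toy_losses = [0.50, 0.40, 0.35, 0.36, 0.37, 0.38]
    best_idx, stop_idx = best_epoch_with_patience(toy_losses, patience=3)
    print(f"best epoch index = {best_idx}, stopped at index = {stop_idx}")

Under this rule the weights from ``best_idx`` are the ones to keep, which
corresponds to the starred point in the left panel.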

.. GENERATED FROM PYTHON SOURCE LINES 235-295

.. code-block:: Python

    if is_main_process and history["epoch"]:
        fig2, (ax_tr, ax_bar) = plt.subplots(1, 2, figsize=(12.8, 4.8), constrained_layout=True)

        ep = history["epoch"]
        val_mse = history["val_mse"]
        ax_tr.plot(ep, val_mse, "o-", color="#0f766e", lw=2.0, markersize=5, label="fine-tuning: val MSE")
        ax_tr.axhline(
            base_val_mse,
            ls="--",
            color="#64748b",
            lw=1.5,
            label=f"pretrained baseline ({base_val_mse:.3f})",
        )
        best_idx = int(np.nanargmin(val_mse))
        ax_tr.scatter(
            [ep[best_idx]],
            [val_mse[best_idx]],
            marker="*",
            s=220,
            color="#f59e0b",
            edgecolor="black",
            linewidths=0.8,
            zorder=5,
            label=f"best epoch ({val_mse[best_idx]:.3f} @ epoch {ep[best_idx]})",
        )
        ax_tr.set_xlabel("epoch")
        ax_tr.set_ylabel("validation MSE (lower is better)")
        ax_tr.set_title("Validation metric across fine-tuning epochs")
        ax_tr.grid(alpha=0.3)
        ax_tr.legend(fontsize=9, loc="upper right")

        metric_names = ["MSE ↓", "MAE ↓", "R² ↑"]
        base_vals = [base_mse, base_mae, base_r2]
        ft_vals = [ft_mse, ft_mae, ft_r2]
        x_pos = np.arange(len(metric_names))
        w = 0.38
        bars_b = ax_bar.bar(x_pos - w / 2, base_vals, w, color="#64748b", label="pretrained")
        bars_f = ax_bar.bar(x_pos + w / 2, ft_vals, w, color="#0f766e", label="fine-tuned")
        for bars, vals in [(bars_b, base_vals), (bars_f, ft_vals)]:
            for rect, v in zip(bars, vals):
                y_anchor = v + (0.02 if v >= 0 else -0.04)
                ax_bar.text(
                    rect.get_x() + rect.get_width() / 2,
                    y_anchor,
                    f"{v:.3f}",
                    ha="center",
                    va="bottom" if v >= 0 else "top",
                    fontsize=8,
                )
        ax_bar.set_xticks(x_pos)
        ax_bar.set_xticklabels(metric_names)
        ax_bar.set_title("Test-set metrics: pretrained vs. fine-tuned")
        ax_bar.set_ylabel("metric value")
        ax_bar.axhline(0, color="black", lw=0.5)
        ax_bar.grid(alpha=0.25, axis="y")
        ax_bar.legend(fontsize=9, loc="upper right")

        fig2.suptitle("Training dynamics & test-set gains", fontsize=13)
        plt.show()


.. image-sg:: /tutorials/images/sphx_glr_finetune_regressor_002.png
   :alt: Training dynamics & test-set gains, Validation metric across fine-tuning epochs, Test-set metrics: pretrained vs. fine-tuned
   :srcset: /tutorials/images/sphx_glr_finetune_regressor_002.png
   :class: sphx-glr-single-img


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 36.150 seconds)


.. _sphx_glr_download_tutorials_finetune_regressor.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: finetune_regressor.ipynb <finetune_regressor.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: finetune_regressor.py <finetune_regressor.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: finetune_regressor.zip <finetune_regressor.zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_