2026.TA.gemma2_2b_tc8192_decb_l1w0.001_tarbb_lb2.0_ln1_dr20000_lr8e-04_bs4_sl14793860

Sparse transcoder adapter trained with bridging mode.
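
The run name appears to encode several of the hyperparameters listed on this card. The sketch below recovers them with regular expressions. The prefix-to-field mapping (tc, decb, l1w, tarbb, lb, ln, lr, bs) is an assumption inferred by matching values against the card, and parse_run_name is a hypothetical helper, not part of any released tooling. The dr and sl segments are skipped because the card does not say what they mean.

```python
import re

RUN_NAME = ("2026.TA.gemma2_2b_tc8192_decb_l1w0.001_tarbb_lb2.0_ln1"
            "_dr20000_lr8e-04_bs4_sl14793860")

# Assumed prefix -> field mapping, inferred from values listed on the card.
# "dr" and "sl" are intentionally omitted: their meaning is not documented.
PATTERNS = {
    "n_features": (r"_tc(\d+)", int),
    "l1_weight": (r"_l1w([0-9.e+-]+)", float),
    "lambda_bridge": (r"_lb([0-9.e+-]+)", float),
    "lambda_nmse": (r"_ln([0-9.e+-]+)", float),
    "learning_rate": (r"_lr([0-9.e+-]+)", float),
    "batch_size": (r"_bs(\d+)", int),
}


def parse_run_name(name):
    """Extract the hyperparameters that the run name seems to encode."""
    config = {}
    for field, (pattern, cast) in PATTERNS.items():
        match = re.search(pattern, name)
        if match:
            config[field] = cast(match.group(1))
    # Flag-style segments carry no numeric value (assumed meanings).
    config["dec_bias"] = "_decb" in name
    config["backbone"] = "target" if "_tarbb" in name else None
    return config


config = parse_run_name(RUN_NAME)
for key, value in config.items():
    print(f"{key}: {value}")
```

Under these assumptions, the parsed values agree with the Transcoder Configuration and Training sections of this card.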

Model Details

Transcoder Configuration

  • n_features: 8192
  • dec_bias: True
  • l1_weight: 0.001
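
To make these three settings concrete, here is a minimal toy sketch of a sparse transcoder in plain Python. The card does not describe the architecture, so the ReLU encoder, the linear decoder, the Gaussian initialization, and the class name TinyTranscoder are all assumptions made for illustration. The feature count is shrunk from 8192 to 16 so the example runs instantly.

```python
import random


def relu(x):
    return x if x > 0.0 else 0.0


class TinyTranscoder:
    """Toy sparse transcoder: ReLU encoder, linear decoder, optional decoder bias.

    Hypothetical illustration only; not the architecture used for this model.
    """

    def __init__(self, d_in, d_out, n_features, dec_bias=True, seed=0):
        rng = random.Random(seed)
        # Encoder weights: one row per feature.
        self.w_enc = [[rng.gauss(0.0, 0.1) for _ in range(d_in)]
                      for _ in range(n_features)]
        self.b_enc = [0.0] * n_features
        # Decoder weights: one row per output dimension.
        self.w_dec = [[rng.gauss(0.0, 0.1) for _ in range(n_features)]
                      for _ in range(d_out)]
        # dec_bias=True adds a learned offset to the decoder output.
        self.b_dec = [0.0] * d_out if dec_bias else None

    def encode(self, x):
        """Map an input vector to non-negative feature activations."""
        return [relu(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.w_enc, self.b_enc)]

    def decode(self, acts):
        """Map feature activations to an output vector."""
        out = [sum(w * a for w, a in zip(row, acts)) for row in self.w_dec]
        if self.b_dec is not None:
            out = [o + b for o, b in zip(out, self.b_dec)]
        return out


def l1_penalty(acts, l1_weight=0.001):
    """Sparsity term: l1_weight times the sum of absolute activations."""
    return l1_weight * sum(abs(a) for a in acts)


tc = TinyTranscoder(d_in=4, d_out=4, n_features=16, dec_bias=True)
x = [0.5, -1.0, 0.25, 2.0]
acts = tc.encode(x)
y = tc.decode(acts)
penalty = l1_penalty(acts)
print(len(acts), len(y), penalty)
```

In this sketch, n_features sets the width of the hidden feature layer, dec_bias toggles the decoder offset, and l1_weight scales the penalty that pushes activations toward zero.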

Training

  • Learning rate: 0.0008
  • Batch size: 4
  • Epochs: 1
  • Warmup ratio: 0.05
  • Loss type: kl
  • lambda_adapt: 1.0
  • lambda_bridge: 2.0
  • lambda_nmse: 1
  • n_cutoffs: 1
  • backbone: target
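
The card lists the loss weights and warmup ratio but not how they are combined. The sketch below shows one plausible reading: a weighted sum of an adaptation term, a bridging term, an NMSE term, and an L1 sparsity term, plus a linear learning-rate warmup over the first 5% of steps. The function names, the weighted-sum form, and the constant learning rate after warmup are assumptions, not the training code behind this model.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as probability lists."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0.0)


def nmse(pred, target, eps=1e-12):
    """Squared error normalized by the squared norm of the target."""
    num = sum((a - b) ** 2 for a, b in zip(pred, target))
    den = sum(b ** 2 for b in target) + eps
    return num / den


def total_loss(adapt, bridge, nmse_term, l1_term,
               lambda_adapt=1.0, lambda_bridge=2.0,
               lambda_nmse=1.0, l1_weight=0.001):
    """Assumed combination: weighted sum of the four terms."""
    return (lambda_adapt * adapt + lambda_bridge * bridge
            + lambda_nmse * nmse_term + l1_weight * l1_term)


def warmup_lr(step, total_steps, peak_lr=0.0008, warmup_ratio=0.05):
    """Linear warmup to peak_lr; held constant afterwards (assumption)."""
    warmup_steps = max(1, int(total_steps * warmup_ratio))
    return peak_lr * min(1.0, (step + 1) / warmup_steps)


# Example: KL between a reference distribution and an adapted one.
p = [0.7, 0.2, 0.1]
q = [0.6, 0.3, 0.1]
loss = total_loss(adapt=kl_divergence(p, q), bridge=kl_divergence(p, q),
                  nmse_term=nmse([0.9, 2.1], [1.0, 2.0]), l1_term=3.5)
print(loss, warmup_lr(0, 1000), warmup_lr(500, 1000))
```

With lambda_bridge at 2.0, the bridging term counts twice as much as the adaptation term in this reading, while the L1 term contributes only a small regularizing nudge.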

Training Data

Safetensors

  • Model size: 4B params
  • Tensor type: BF16