This is a sentence-transformers model fine-tuned from Snowflake/snowflake-arctic-embed-l. It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("AneetaXavier/reformer-pilates-embed-ft-49fc1835-9968-433d-9c45-1538ea91dcc9")
# Run inference
sentences = [
'What modifications are suggested if the exercise feels too intense on the arms or wrists?',
"spine relax the shoulders lift the head\nand again\nhead goes down Pike it up inhale\nand exhale roll through\n[Applause]\ngood keep going here if this is too\nintense on the arms or wrists especially\nyou're going to do the same thing here\nPike it up\non your knees and roll through my knees\nare just kind of facing over to the left\nside Pike it up\ninhale and exhale roll\ngood two more you guys you're doing so\ngood it's intense I know\nroll through\nand lift\nlast one\nand finishing that Pike good you guys\ntake those feet\nonto the carriage catch your breath if\nyou want lean it back if you can lift\nyour foot bar to find that click to kind\nof lean back stretch through your\nshoulders kind of depending on your\nreformer if yours is able to pull back",
"towards the spine but keep the spine in\na neutral position fully straighten the\nlegs when you straighten them and now\ninto VMO knock-knees okay so your toes\nare exactly where they are you push out\nkeeping the knees together go all the\nway back into the stopper and then\nwithin that range you're going to do 20\nof these so the knees are together\nthroughout the whole of the exercise the\ntoes are on the bar as they were in the\nV position but then the heels are out\nwider so it's like a knocked knee this\nreally gets into the muscles on the\ninside of the knees and the inside of\nthe legs in through the nose out through\nthe mouth\nexpanding the ribs and then contracting\nthe abdominals keeping the muscles in\nthe legs engaged throughout prehensile",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
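Because this card lists MatryoshkaLoss with dimensions 768, 512, 256, 128, and 64, shortened embeddings may still be usable for retrieval. The following is a minimal sketch of truncating and re-normalizing a vector, assuming plain Python lists and toy 8-dimensional values in place of real model output; `truncate_and_normalize` and `cosine` are hypothetical helper names, not part of the library.

```python
import math


def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` values and rescale the result to unit length."""
    head = list(embedding[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # nothing to rescale
    return [x / norm for x in head]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors standing in for 1024-dimensional embeddings (assumed values).
query = [0.5, 0.4, 0.3, 0.2, 0.1, 0.05, 0.02, 0.01]
passage = [0.45, 0.5, 0.25, 0.1, 0.2, 0.0, 0.03, 0.02]

short_query = truncate_and_normalize(query, 4)
short_passage = truncate_and_normalize(passage, 4)

print(len(short_query))
print(round(cosine(short_query, short_passage), 4))
```

With the real model, Sentence Transformers also accepts a `truncate_dim` argument when loading, which should achieve the same effect without manual slicing. How much retrieval quality is retained at each dimension is not reported in this card.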
InformationRetrievalEvaluator
| Metric | Value |
|---|---|
| cosine_accuracy@1 | 0.7333 |
| cosine_accuracy@3 | 0.9667 |
| cosine_accuracy@5 | 1.0 |
| cosine_accuracy@10 | 1.0 |
| cosine_precision@1 | 0.7333 |
| cosine_precision@3 | 0.3222 |
| cosine_precision@5 | 0.2 |
| cosine_precision@10 | 0.1 |
| cosine_recall@1 | 0.7333 |
| cosine_recall@3 | 0.9667 |
| cosine_recall@5 | 1.0 |
| cosine_recall@10 | 1.0 |
| cosine_ndcg@10 | 0.876 |
| cosine_mrr@10 | 0.8344 |
| cosine_map@100 | 0.8344 |
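In the table above, precision@k equals accuracy@k divided by k (for example 0.9667 / 3 ≈ 0.3222) and recall@k equals accuracy@k, which is consistent with each query having exactly one relevant passage. The sketch below illustrates that relationship under that assumption; the `ranks` list is made-up illustration data, not the card's evaluation set.

```python
def accuracy_at_k(ranks, k):
    """Fraction of queries whose relevant passage appears in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


def precision_at_k(ranks, k):
    """With one relevant passage per query, at most one of the top k is a hit."""
    return sum(1 for r in ranks if r <= k) / (k * len(ranks))


def recall_at_k(ranks, k):
    """With one relevant passage per query, recall equals accuracy."""
    return accuracy_at_k(ranks, k)


def mrr_at_k(ranks, k):
    """Mean reciprocal rank, counting only hits within the top k."""
    return sum(1.0 / r for r in ranks if r <= k) / len(ranks)


# Hypothetical 1-based rank of the single relevant passage for six queries.
ranks = [1, 1, 2, 1, 3, 1]

for k in (1, 3, 5):
    print(k, accuracy_at_k(ranks, k), precision_at_k(ranks, k), mrr_at_k(ranks, k))
```

When every relevant passage lands within the top 10, MRR@10 and MAP@100 coincide under the same one-relevant-passage assumption, which matches the identical 0.8344 values reported above.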
Columns: sentence_0 and sentence_1
| | sentence_0 | sentence_1 |
|---|---|---|
| type | string | string |
| details | | |
| sentence_0 | sentence_1 |
|---|---|
| What equipment and spring settings does Dez recommend for starting the Pilates reformer workout? | [Music] |
| How does Dez suggest protecting the neck during the hip rolls exercise? | [Music] |
| What is the correct breathing technique to use while rocking between imprint and neutral spine positions? | heels on the bar hip distance there we |
MatryoshkaLoss with these parameters:
{
"loss": "MultipleNegativesRankingLoss",
"matryoshka_dims": [
768,
512,
256,
128,
64
],
"matryoshka_weights": [
1,
1,
1,
1,
1
],
"n_dims_per_step": -1
}
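The configuration above wraps MultipleNegativesRankingLoss in MatryoshkaLoss with equal weights. As a rough illustration of the idea (not the library's implementation), the sketch below computes an in-batch-negatives cross-entropy on cosine scores for several truncated prefixes and sums the results. The scale of 20, the 4-dimensional toy vectors, and the reduced dimension list `[4, 2]` are assumptions made for the example.

```python
import math


def normalize(vec):
    """Rescale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def mnr_loss(anchors, positives, scale=20.0):
    """Cross-entropy where positives[i] is the target for anchors[i] and
    every other positive in the batch acts as a negative."""
    a = [normalize(v) for v in anchors]
    p = [normalize(v) for v in positives]
    total = 0.0
    for i, anchor in enumerate(a):
        scores = [scale * sum(x * y for x, y in zip(anchor, pos)) for pos in p]
        top = max(scores)  # subtract the max for numerical stability
        log_sum_exp = top + math.log(sum(math.exp(s - top) for s in scores))
        total += log_sum_exp - scores[i]
    return total / len(a)


def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the base loss over truncated embedding prefixes."""
    total = 0.0
    for dim, weight in zip(dims, weights):
        total += weight * mnr_loss(
            [v[:dim] for v in anchors], [v[:dim] for v in positives]
        )
    return total


# Toy batch of two (anchor, positive) pairs; values are assumed.
anchors = [[1.0, 0.1, 0.2, 0.0], [0.1, 1.0, 0.0, 0.3]]
positives = [[0.9, 0.2, 0.1, 0.1], [0.2, 0.8, 0.1, 0.2]]

aligned = matryoshka_loss(anchors, positives, dims=[4, 2], weights=[1, 1])
swapped = matryoshka_loss(anchors, positives[::-1], dims=[4, 2], weights=[1, 1])
print(aligned, swapped)
```

Correctly paired batches should yield a lower loss than mismatched ones at every prefix length, which is what pushes the leading dimensions to carry most of the ranking signal.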
Non-default hyperparameters:
- eval_strategy: steps
- per_device_train_batch_size: 10
- per_device_eval_batch_size: 10
- num_train_epochs: 30
- multi_dataset_batch_sampler: round_robin

All hyperparameters:
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 10
- per_device_eval_batch_size: 10
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 30
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- tp_size: 0
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin

| Epoch | Step | cosine_ndcg@10 |
|---|---|---|
| 1.0 | 12 | 0.8455 |
| 2.0 | 24 | 0.8970 |
| 3.0 | 36 | 0.9064 |
| 4.0 | 48 | 0.9237 |
| 4.1667 | 50 | 0.9360 |
| 5.0 | 60 | 0.8633 |
| 6.0 | 72 | 0.9016 |
| 7.0 | 84 | 0.8814 |
| 8.0 | 96 | 0.8676 |
| 8.3333 | 100 | 0.8599 |
| 9.0 | 108 | 0.8633 |
| 10.0 | 120 | 0.8903 |
| 11.0 | 132 | 0.8760 |
| 12.0 | 144 | 0.8793 |
| 12.5 | 150 | 0.8960 |
| 13.0 | 156 | 0.8970 |
| 14.0 | 168 | 0.8970 |
| 15.0 | 180 | 0.9026 |
| 16.0 | 192 | 0.8903 |
| 16.6667 | 200 | 0.8804 |
| 17.0 | 204 | 0.8927 |
| 18.0 | 216 | 0.9093 |
| 19.0 | 228 | 0.8960 |
| 20.0 | 240 | 0.8916 |
| 20.8333 | 250 | 0.8916 |
| 21.0 | 252 | 0.8916 |
| 22.0 | 264 | 0.8927 |
| 23.0 | 276 | 0.8916 |
| 24.0 | 288 | 0.8916 |
| 25.0 | 300 | 0.8750 |
| 26.0 | 312 | 0.8750 |
| 27.0 | 324 | 0.8627 |
| 28.0 | 336 | 0.8637 |
| 29.0 | 348 | 0.8760 |
| 29.1667 | 350 | 0.8760 |
| 30.0 | 360 | 0.8760 |
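The training log above peaks early and then drifts lower. A small sketch for locating the best logged step follows; the `(step, score)` pairs are copied from the table, and the 12 steps per epoch are read off its first row. Since `load_best_model_at_end` is False, the released weights presumably correspond to the final step, though the card does not say so explicitly.

```python
# (step, cosine_ndcg@10) pairs copied from the training log table.
log = [
    (12, 0.8455), (24, 0.8970), (36, 0.9064), (48, 0.9237), (50, 0.9360),
    (60, 0.8633), (72, 0.9016), (84, 0.8814), (96, 0.8676), (100, 0.8599),
    (108, 0.8633), (120, 0.8903), (132, 0.8760), (144, 0.8793), (150, 0.8960),
    (156, 0.8970), (168, 0.8970), (180, 0.9026), (192, 0.8903), (200, 0.8804),
    (204, 0.8927), (216, 0.9093), (228, 0.8960), (240, 0.8916), (250, 0.8916),
    (252, 0.8916), (264, 0.8927), (276, 0.8916), (288, 0.8916), (300, 0.8750),
    (312, 0.8750), (324, 0.8627), (336, 0.8637), (348, 0.8760), (350, 0.8760),
    (360, 0.8760),
]

STEPS_PER_EPOCH = 12  # epoch 1.0 is logged at step 12

# Highest-scoring evaluation point and the last logged point.
best_step, best_score = max(log, key=lambda row: row[1])
final_step, final_score = log[-1]

print("best:", best_step, round(best_step / STEPS_PER_EPOCH, 4), best_score)
print("final:", final_step, round(final_step / STEPS_PER_EPOCH, 4), final_score)
print("gap:", round(best_score - final_score, 4))
```

Given how small the evaluation set appears to be, differences of this size may be within noise, so the comparison is indicative rather than conclusive.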
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{kusupati2024matryoshka,
title={Matryoshka Representation Learning},
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
year={2024},
eprint={2205.13147},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}