phuongntc committed
Commit 2bcedff · verified · 1 Parent(s): 97fb487

Upload MultiEvalVietSum: weights, tokenizer, config, code, and model card

.gitattributes CHANGED
@@ -1,35 +1,2 @@
- *.7z filter=lfs diff=lfs merge=lfs -text
- *.arrow filter=lfs diff=lfs merge=lfs -text
  *.bin filter=lfs diff=lfs merge=lfs -text
- *.bz2 filter=lfs diff=lfs merge=lfs -text
- *.ckpt filter=lfs diff=lfs merge=lfs -text
- *.ftz filter=lfs diff=lfs merge=lfs -text
- *.gz filter=lfs diff=lfs merge=lfs -text
- *.h5 filter=lfs diff=lfs merge=lfs -text
- *.joblib filter=lfs diff=lfs merge=lfs -text
- *.lfs.* filter=lfs diff=lfs merge=lfs -text
- *.mlmodel filter=lfs diff=lfs merge=lfs -text
- *.model filter=lfs diff=lfs merge=lfs -text
- *.msgpack filter=lfs diff=lfs merge=lfs -text
- *.npy filter=lfs diff=lfs merge=lfs -text
- *.npz filter=lfs diff=lfs merge=lfs -text
- *.onnx filter=lfs diff=lfs merge=lfs -text
- *.ot filter=lfs diff=lfs merge=lfs -text
- *.parquet filter=lfs diff=lfs merge=lfs -text
- *.pb filter=lfs diff=lfs merge=lfs -text
- *.pickle filter=lfs diff=lfs merge=lfs -text
- *.pkl filter=lfs diff=lfs merge=lfs -text
- *.pt filter=lfs diff=lfs merge=lfs -text
- *.pth filter=lfs diff=lfs merge=lfs -text
- *.rar filter=lfs diff=lfs merge=lfs -text
- *.safetensors filter=lfs diff=lfs merge=lfs -text
- saved_model/**/* filter=lfs diff=lfs merge=lfs -text
- *.tar.* filter=lfs diff=lfs merge=lfs -text
- *.tar filter=lfs diff=lfs merge=lfs -text
- *.tflite filter=lfs diff=lfs merge=lfs -text
- *.tgz filter=lfs diff=lfs merge=lfs -text
- *.wasm filter=lfs diff=lfs merge=lfs -text
- *.xz filter=lfs diff=lfs merge=lfs -text
- *.zip filter=lfs diff=lfs merge=lfs -text
- *.zst filter=lfs diff=lfs merge=lfs -text
- *tfevents* filter=lfs diff=lfs merge=lfs -text
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,129 @@
+ ---
+ language:
+ - vi
+ library_name: transformers
+ pipeline_tag: text-classification
+ tags:
+ - vietnamese
+ - summarization
+ - evaluation
+ - cross-encoder
+ - research
+ ---
+
+ # MultiEvalVietSum
+
+ MultiEvalVietSum is a Vietnamese summary evaluation model released under the Hugging Face account phuongntc.
+
+ It is a criterion-specific cross-encoder evaluator that takes a source document and a candidate summary as input and outputs three scalar scores:
+ - Faithfulness
+ - Coherence
+ - Relevance
+
+ ## Model description
+
+ This model is built on top of the multilingual long-context encoder jhu-clsp/mmBERT-base and fine-tuned as a custom evaluator for Vietnamese summarization research.
+
+ Architecture summary:
+ - Backbone: jhu-clsp/mmBERT-base
+ - Input format: (document, summary) pair
+ - Pooling: CLS + mean pooling
+ - Prediction heads: three scalar regression heads
+ - Criteria: faithfulness, coherence, relevance
+ - Training objective: MSE regression + pairwise margin ranking loss
+
+ ## Intended use
+
+ This model is intended for:
+ - research on automatic summary evaluation in Vietnamese
+ - system comparison for Vietnamese summarization
+ - criterion-specific scoring of candidate summaries against a source document
+
+ This model is not intended to replace human judgment in high-stakes evaluation settings.
+
+ ## Input processing
+
+ The evaluator uses a pairwise input construction strategy:
+ - the summary is truncated first, to at most SUM_MAX_LEN = 192 tokens
+ - the remaining token budget is assigned to the source document
+ - the total pair length is capped at MAX_LEN = 2048 tokens
+
+ Capping the summary at 192 tokens leaves most of the 2048-token budget for the source document, so the evaluator sees as much source evidence as possible.
+
+ ## Reported setup
+
+ - model_name: MultiEvalVietSum
+ - repo_id: phuongntc/MultiEvalVietSum
+ - backbone: jhu-clsp/mmBERT-base
+ - task: Vietnamese summary evaluation
+ - max_len: 2048
+ - summary_max_len: 192
+ - pooling: CLS + mean pooling
+ - outputs: faithfulness, coherence, relevance
+
+ Validation metrics (none are reported in this release; every value is null in training_summary.json):
+ - val_pearson_faith: not reported
+ - val_pearson_coh: not reported
+ - val_pearson_rel: not reported
+ - val_pearson_mean: not reported
+ - val_spearman_faith: not reported
+ - val_spearman_coh: not reported
+ - val_spearman_rel: not reported
+ - val_spearman_mean: not reported
+
+ ## Output format
+
+ The model outputs three scalar scores:
+ 1. faithfulness
+ 2. coherence
+ 3. relevance
+
+ Users may optionally combine them into an overall score using a weighting scheme appropriate for their study.
+
+ ## Limitations
+
+ - The model only sees the truncated (document, summary) pair defined by the preprocessing pipeline
+ - Very long documents may be partially invisible to the evaluator
+ - If a candidate summary is longer than the summary cap, only the visible portion is evaluated
+ - Performance may vary across domains outside the training or evaluation distribution
+
+ ## Transparency and reproducibility notes
+
+ To reproduce scores as closely as possible, users should keep the following consistent:
+ - backbone model
+ - tokenizer
+ - MAX_LEN
+ - SUM_MAX_LEN
+ - pair construction rule
+ - model architecture and checkpoint
+
+ The repository includes:
+ - tokenizer files
+ - evaluator weights
+ - a custom loader file
+ - an inference example
+ - a training summary file
+
+ ## How to use
+
+ After downloading the repo, use the included files:
+ - modeling_multievalvietsum.py
+ - inference_example.py
+
+ Example:
+ 1. Download or clone the repository
+ 2. Open Python in that folder
+ 3. Run:
+ from inference_example import predict_scores
+ scores = predict_scores("Văn bản gốc", "Bản tóm tắt", model_dir=".")
+ print(scores)
+
+ ## Citation
+
+     @misc{phuong2026multievalvietsum,
+     title={MultiEvalVietSum: A Vietnamese Criterion-Specific Evaluator for Summary Assessment},
+     author={Phuong N. T. and collaborators},
+     year={2026},
+     note={Model card and code release on Hugging Face},
+     howpublished={\url{https://huggingface.co/phuongntc/MultiEvalVietSum}}
+     }
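The truncation rule in the README's "Input processing" section can be sketched without a tokenizer. The following minimal illustration works on plain lists of token ids. The helper name `pair_budget` and the default of 3 special tokens are assumptions made for this example (the real count comes from `tokenizer.num_special_tokens_to_add(pair=True)`); the 16-token floor mirrors the one in `inference_example.py`.

```python
def pair_budget(doc_ids, sum_ids, max_len=2048, summary_max_len=192, n_special=3):
    """Truncate a (document, summary) pair of token-id lists.

    n_special is an ASSUMED count of special tokens for a pair; query the
    real tokenizer for the actual value.
    """
    # 1) Cap the summary first.
    sum_ids = sum_ids[:summary_max_len]
    # 2) Give the document whatever budget is left (never fewer than 16 tokens).
    doc_budget = max(16, max_len - len(sum_ids) - n_special)
    doc_ids = doc_ids[:doc_budget]
    return doc_ids, sum_ids


# A 5000-token document and a 300-token summary, both over budget.
doc, summ = pair_budget(list(range(5000)), list(range(300)))
print(len(doc), len(summ))  # 1853 192
```

With these defaults the truncated pair plus the assumed special tokens adds up to exactly 2048, and inputs that already fit are returned unchanged.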
inference_example.py ADDED
@@ -0,0 +1,85 @@
+ import torch
+ from modeling_multievalvietsum import MultiEvalVietSumModel
+
+
+ def build_pair_feature(tokenizer, document, summary, max_len=2048, summary_max_len=192):
+     sum_ids = tokenizer(  # summary is truncated first, to summary_max_len tokens
+         summary,
+         truncation=True,
+         max_length=summary_max_len,
+         add_special_tokens=False,
+         return_attention_mask=False,
+     )["input_ids"]
+
+     doc_ids = tokenizer(  # document is tokenized in full, then cut to the budget below
+         document,
+         truncation=False,
+         add_special_tokens=False,
+         return_attention_mask=False,
+     )["input_ids"]
+
+     special_pair_tokens = tokenizer.num_special_tokens_to_add(pair=True)
+     doc_budget = max(16, max_len - len(sum_ids) - special_pair_tokens)  # at least 16 document tokens
+     doc_ids = doc_ids[:doc_budget]  # keep the leading tokens of the document
+
+     model_inputs = getattr(tokenizer, "model_input_names", [])
+     return_token_type_ids = "token_type_ids" in model_inputs
+
+     try:
+         feat = tokenizer.prepare_for_model(
+             doc_ids,
+             pair_ids=sum_ids,
+             add_special_tokens=True,
+             padding=False,
+             truncation=False,
+             return_attention_mask=True,
+             return_token_type_ids=return_token_type_ids,
+         )
+         feat = {k: v for k, v in feat.items() if k in {"input_ids", "attention_mask", "token_type_ids"}}
+         return feat
+     except Exception:  # fallback: assemble [CLS] doc [SEP] summary [SEP] by hand
+         cls_id = tokenizer.cls_token_id
+         sep_id = tokenizer.sep_token_id
+         input_ids = [cls_id] + doc_ids + [sep_id] + sum_ids + [sep_id]
+         attention_mask = [1] * len(input_ids)
+         feat = {
+             "input_ids": input_ids,
+             "attention_mask": attention_mask,
+         }
+         if return_token_type_ids:
+             feat["token_type_ids"] = [0] * (len(doc_ids) + 2) + [1] * (len(sum_ids) + 1)
+         return feat
+
+
+ @torch.no_grad()
+ def predict_scores(document: str, summary: str, model_dir: str = "."):
+     model, cfg = MultiEvalVietSumModel.from_pretrained_local(model_dir)  # reloaded on every call
+     tokenizer = MultiEvalVietSumModel.load_tokenizer_local(model_dir)
+
+     feat = build_pair_feature(
+         tokenizer,
+         document=document,
+         summary=summary,
+         max_len=cfg["max_len"],
+         summary_max_len=cfg["summary_max_len"],
+     )
+
+     batch = {  # add a batch dimension of size 1
+         "input_ids": torch.tensor([feat["input_ids"]], dtype=torch.long),
+         "attention_mask": torch.tensor([feat["attention_mask"]], dtype=torch.long),
+     }
+     if "token_type_ids" in feat:
+         batch["token_type_ids"] = torch.tensor([feat["token_type_ids"]], dtype=torch.long)
+
+     scores = model(**batch)[0].cpu().tolist()  # [faithfulness, coherence, relevance]
+     return {
+         "faithfulness": float(scores[0]),
+         "coherence": float(scores[1]),
+         "relevance": float(scores[2]),
+     }
+
+
+ if __name__ == "__main__":
+     doc = "Văn bản gốc mẫu."
+     summ = "Bản tóm tắt mẫu."
+     print(predict_scores(doc, summ))
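The README leaves the choice of an overall score to the user. As one possible approach, the dictionary returned by `predict_scores` could be combined with a weighted average. The helper name `overall_score`, the example weights, and the example scores below are hypothetical and not part of the release.

```python
def overall_score(scores, weights=None):
    """Weighted average of the three criterion scores.

    `weights` maps criterion name -> non-negative weight; equal weights are
    used when it is omitted. This helper is illustrative only.
    """
    criteria = ("faithfulness", "coherence", "relevance")
    if weights is None:
        weights = {name: 1.0 for name in criteria}
    total = sum(weights[name] for name in criteria)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[name] * scores[name] for name in criteria) / total


# Hypothetical scores in the shape returned by predict_scores().
example = {"faithfulness": 4.0, "coherence": 3.0, "relevance": 2.0}
print(overall_score(example))  # 3.0
print(overall_score(example, {"faithfulness": 2.0, "coherence": 1.0, "relevance": 1.0}))  # 3.25
```

Whatever weighting is chosen should be reported alongside the results, since the three heads are trained as separate criteria.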
modeling_multievalvietsum.py ADDED
@@ -0,0 +1,65 @@
+ import json
+ from pathlib import Path
+
+ import torch
+ import torch.nn as nn
+ from transformers import AutoModel, AutoTokenizer
+
+
+ def mean_pool(last_hidden_state: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
+     mask = attention_mask.unsqueeze(-1).float()  # (batch, seq, 1)
+     masked = last_hidden_state * mask  # zero out padding positions
+     denom = mask.sum(dim=1).clamp(min=1e-6)  # avoid division by zero
+     return masked.sum(dim=1) / denom
+
+
+ class MultiEvalVietSumModel(nn.Module):
+     def __init__(self, backbone_name: str):
+         super().__init__()
+         self.backbone_name = backbone_name
+         self.model = AutoModel.from_pretrained(backbone_name)
+         hidden = self.model.config.hidden_size
+
+         self.trunk = nn.Sequential(
+             nn.Linear(hidden * 2, 256),  # input is CLS vector concatenated with mean-pooled vector
+             nn.GELU(),
+             nn.Dropout(0.1),
+         )
+         self.head_faith = nn.Linear(256, 1)
+         self.head_coh = nn.Linear(256, 1)
+         self.head_rel = nn.Linear(256, 1)
+
+     def forward(self, input_ids, attention_mask, token_type_ids=None):
+         kwargs = {
+             "input_ids": input_ids,
+             "attention_mask": attention_mask,
+         }
+         if token_type_ids is not None:
+             kwargs["token_type_ids"] = token_type_ids
+
+         out = self.model(**kwargs)
+         cls_vec = out.last_hidden_state[:, 0]  # first-token representation
+         mean_vec = mean_pool(out.last_hidden_state, attention_mask)
+         pooled = torch.cat([cls_vec, mean_vec], dim=-1)
+         z = self.trunk(pooled)
+
+         faith = self.head_faith(z)
+         coh = self.head_coh(z)
+         rel = self.head_rel(z)
+         return torch.cat([faith, coh, rel], dim=1)  # (batch, 3): faithfulness, coherence, relevance
+
+     @classmethod
+     def from_pretrained_local(cls, model_dir: str):
+         model_dir = Path(model_dir)
+         with open(model_dir / "multievalvietsum_config.json", "r", encoding="utf-8") as f:
+             cfg = json.load(f)
+
+         model = cls(backbone_name=cfg["backbone_name"])  # builds the backbone, then overwrites its weights
+         state_dict = torch.load(model_dir / "pytorch_model.bin", map_location="cpu")
+         model.load_state_dict(state_dict, strict=True)
+         model.eval()
+         return model, cfg
+
+     @staticmethod
+     def load_tokenizer_local(model_dir: str):
+         return AutoTokenizer.from_pretrained(model_dir, use_fast=True)
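To make the pooling in `forward` concrete, here is a dependency-free sketch of the same idea on nested Python lists instead of tensors: a masked mean over token vectors, concatenated with the first token's vector. The function names and toy values are illustrative; the released model does this with `torch` on the backbone's hidden states.

```python
def masked_mean(hidden, mask):
    """Average token vectors where mask == 1 (one sequence, no batch dim)."""
    dim = len(hidden[0])
    denom = max(sum(mask), 1e-6)  # same guard against an all-zero mask
    return [sum(vec[i] * m for vec, m in zip(hidden, mask)) / denom for i in range(dim)]


def cls_plus_mean(hidden, mask):
    """Concatenate the first token's vector with the masked mean."""
    return hidden[0] + masked_mean(hidden, mask)


# Three tokens with 2-dim vectors; the last token is padding.
hidden = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
print(cls_plus_mean(hidden, mask))  # [1.0, 2.0, 2.0, 3.0]
```

The result is twice as wide as a single token vector, which is why the first layer of the trunk takes `hidden * 2` input features.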
multievalvietsum_config.json ADDED
@@ -0,0 +1,17 @@
+ {
+   "model_type": "multievalvietsum",
+   "repo_id": "phuongntc/MultiEvalVietSum",
+   "backbone_name": "jhu-clsp/mmBERT-base",
+   "max_len": 2048,
+   "summary_max_len": 192,
+   "pooling": "cls_plus_mean",
+   "outputs": [
+     "faithfulness",
+     "coherence",
+     "relevance"
+   ],
+   "notes": [
+     "Custom evaluator architecture on top of a Hugging Face backbone",
+     "Use modeling_multievalvietsum.py to load the model correctly"
+   ]
+ }
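Since `from_pretrained_local` and `predict_scores` read `backbone_name`, `max_len`, and `summary_max_len` straight from this file, a small sanity check before loading can catch a broken or edited config early. The function `check_config` below is a hypothetical helper, not part of the repository, and the specific checks are assumptions about what a valid config looks like.

```python
import json

REQUIRED_KEYS = ("backbone_name", "max_len", "summary_max_len", "outputs")


def check_config(cfg):
    """Raise ValueError if the evaluator config is missing keys or inconsistent."""
    missing = [key for key in REQUIRED_KEYS if key not in cfg]
    if missing:
        raise ValueError(f"missing config keys: {missing}")
    if not 0 < cfg["summary_max_len"] < cfg["max_len"]:
        raise ValueError("summary_max_len must be positive and smaller than max_len")
    return cfg


# Inline copy of the relevant fields, used here instead of reading the file.
cfg = check_config(json.loads(
    '{"backbone_name": "jhu-clsp/mmBERT-base", "max_len": 2048,'
    ' "summary_max_len": 192, "outputs": ["faithfulness", "coherence", "relevance"]}'
))
print(cfg["max_len"] - cfg["summary_max_len"])  # 1856
```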
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a8b38a5508327b1f79a42074c322d9bb628ee56f60600d612c7481c124fb89d1
+ size 1229393435
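The pointer above is a Git LFS stub, not the weights themselves: the `oid` is the SHA-256 digest of the real file and `size` is its length in bytes. A downloaded copy of `pytorch_model.bin` could be checked against the pointer roughly as follows. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names chosen for this sketch.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Return (sha256_hex, size_in_bytes) from Git LFS pointer text."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    if algo != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algo}")
    return digest, int(fields["size"])


def matches_pointer(path, pointer_text, chunk_size=1 << 20):
    """Check a local file's size and SHA-256 against an LFS pointer."""
    digest, size = parse_lfs_pointer(pointer_text)
    if os.path.getsize(path) != size:
        return False  # cheap check first; skips hashing a wrong-sized file
    sha = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            sha.update(chunk)
    return sha.hexdigest() == digest


POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:a8b38a5508327b1f79a42074c322d9bb628ee56f60600d612c7481c124fb89d1
size 1229393435
"""
print(parse_lfs_pointer(POINTER))
```

Calling `matches_pointer("pytorch_model.bin", POINTER)` after a download reads the file in 1 MiB chunks, so the 1.2 GB checkpoint never has to fit in memory at once.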
tokenizer.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:609d8f4c067cd3950f88594c5a802616cea245823836ef5848ee4fc40aab5b6f
+ size 34363188
tokenizer_config.json ADDED
@@ -0,0 +1,24 @@
+ {
+   "backend": "tokenizers",
+   "bos_token": "<bos>",
+   "clean_up_tokenization_spaces": false,
+   "cls_token": "<bos>",
+   "eos_token": "<eos>",
+   "extra_special_tokens": [
+     "<start_of_turn>",
+     "<end_of_turn>"
+   ],
+   "is_local": false,
+   "mask_token": "<mask>",
+   "model_input_names": [
+     "input_ids",
+     "attention_mask"
+   ],
+   "model_max_length": 8192,
+   "pad_token": "<pad>",
+   "padding_side": "right",
+   "sep_token": "<eos>",
+   "spaces_between_special_tokens": false,
+   "tokenizer_class": "TokenizersBackend",
+   "unk_token": "<unk>"
+ }
training_summary.json ADDED
@@ -0,0 +1,50 @@
+ {
+   "model_name": "MultiEvalVietSum",
+   "repo_id": "phuongntc/MultiEvalVietSum",
+   "backbone": "jhu-clsp/mmBERT-base",
+   "task": "Vietnamese summary evaluation",
+   "architecture": {
+     "type": "cross-encoder evaluator",
+     "pooling": "CLS + mean pooling",
+     "heads": [
+       "faithfulness",
+       "coherence",
+       "relevance"
+     ],
+     "loss": "MSE regression + pairwise margin ranking loss"
+   },
+   "tokenization": {
+     "max_len": 2048,
+     "summary_max_len": 192,
+     "pair_construction": "summary truncated first; remaining token budget prioritized for document"
+   },
+   "reported_metrics": {
+     "validation": {
+       "val_pearson_faith": null,
+       "val_pearson_coh": null,
+       "val_pearson_rel": null,
+       "val_pearson_mean": null,
+       "val_spearman_faith": null,
+       "val_spearman_coh": null,
+       "val_spearman_rel": null,
+       "val_spearman_mean": null
+     }
+   },
+   "intended_use": [
+     "Evaluate Vietnamese summaries with respect to a source document",
+     "Support research on automatic summary evaluation in Vietnamese",
+     "Provide criterion-specific scores for faithfulness, coherence, and relevance"
+   ],
+   "limitations": [
+     "This model is an automatic evaluator, not a text generator",
+     "Scores are proxy judgments and should not replace careful human evaluation in high-stakes settings",
+     "Performance may degrade on out-of-domain data",
+     "The evaluator only sees the truncated input pair defined by MAX_LEN and SUM_MAX_LEN"
+   ],
+   "transparency_notes": [
+     "The model consumes a document-summary pair and outputs three scalar scores",
+     "Users should report exact preprocessing and truncation settings when reproducing experiments",
+     "For long documents, content beyond the token budget is not visible to the evaluator"
+   ],
+   "citation_bibtex": "@misc{phuong2026multievalvietsum,\n title={MultiEvalVietSum: A Vietnamese Criterion-Specific Evaluator for Summary Assessment},\n author={Phuong N. T. and collaborators},\n year={2026},\n note={Model card and code release on Hugging Face},\n howpublished={\\url{https://huggingface.co/phuongntc/MultiEvalVietSum}}\n}"
+ }
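All validation metrics in this file are `null`. For readers who want to fill them in on their own annotated data, the following sketch shows how Pearson and Spearman correlations between model scores and human ratings are conventionally computed. It is a plain-Python illustration, not the authors' evaluation script; the function names and the toy numbers are made up for the example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Toy data: hypothetical model scores vs. hypothetical human ratings.
model_scores = [0.2, 0.5, 0.9, 0.4]
human_ratings = [1, 3, 5, 2]
print(round(spearman(model_scores, human_ratings), 3))  # 1.0
```

The toy lists are ordered identically, so the rank correlation is perfect; real annotated data would be computed per criterion and then averaged to obtain the `*_mean` fields.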