Update README.md

---
base_model:
- google/gemma-3-1b-it
pipeline_tag: text-generation
---

# Model Card: Parveshiiii/M1-MathX

## Model Details

- **Model Name:** Parveshiiii/M1-MathX
- **Base Architecture:** Gemma (1B parameters)
- **Model Type:** Causal Language Model (text-generation)
- **Training Framework:** Hugging Face Transformers
- **Precision:** fp16
- **Attention Mechanism:** Hybrid sliding-window and full attention layers
- **Tokenizer:** Gemma tokenizer (vocab size 262,144)
## Usage

```python
from transformers import pipeline, TextStreamer

pipe = pipeline("text-generation", model="Parveshiiii/M1-MathX")
messages = [
    {"role": "user", "content": "Who are you?"},
]

# Stream tokens to stdout as they are generated
streamer = TextStreamer(pipe.tokenizer)
pipe(messages, streamer=streamer, max_new_tokens=10000)
```
## Intended Use

- Designed for mathematical reasoning tasks, including problem solving, equation manipulation, and step-by-step derivations.
- Suitable for educational contexts, math tutoring, and research experiments in reasoning alignment.
- Not intended for general-purpose conversation or sensitive domains outside mathematics.
## Training Data

- **Dataset:** MathX (curated mathematical reasoning dataset)
- **Samples Used:** ~300
- **Training Steps:** 50
- **Method:** GRPO (Group Relative Policy Optimization) fine-tuning
- **Objective:** Reinforcement-style alignment for improved reasoning clarity and correctness
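GRPO's central idea can be sketched in a few lines: for each prompt, sample a group of completions, score each with a reward function, and use the reward standardized within the group as the advantage, so no separate value model is needed. The helper below is a minimal illustrative sketch of that group-relative baseline, not the training code actually used for this model.

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-6):
    """Standardize rewards within one group of sampled completions.

    Each completion's advantage is its reward minus the group mean,
    divided by the group standard deviation (GRPO's relative baseline).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Example: 4 sampled answers to one math prompt, reward 1.0 if correct.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Correct completions get a positive advantage and incorrect ones a negative advantage, and the advantages in each group sum to zero, which is what makes the signal "relative" to the group rather than to an absolute baseline.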
## Performance

- Demonstrated strong performance on small-scale math problems and symbolic reasoning tasks.
- Early benchmarks suggest improved accuracy compared to the base Gemma 1B model on math-specific datasets.
- Requires formal evaluation on GSM8K, MATH, and other benchmarks for quantitative comparison.
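For the formal evaluation the card calls for, the usual GSM8K-style metric is exact match on the final numeric answer. A minimal sketch of that metric follows; the regex and normalization here are illustrative assumptions, not part of any official evaluation harness.

```python
import re

def extract_final_number(text):
    """Return the last number in a model's output, or None if there is none."""
    matches = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return float(matches[-1]) if matches else None

def exact_match_accuracy(predictions, references):
    """Fraction of outputs whose final number equals the reference answer."""
    hits = sum(
        1 for pred, ref in zip(predictions, references)
        if extract_final_number(pred) == float(ref)
    )
    return hits / len(references)

preds = ["... so the answer is 42.", "The total cost is $17.50"]
acc = exact_match_accuracy(preds, ["42", "17.5"])  # 1.0 for this pair
```

Scoring only the final number keeps the metric robust to differences in the step-by-step reasoning text, which matters for a model tuned to produce long derivations.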
## Limitations

- Small dataset and limited training steps mean coverage is narrow.
- May overfit to MathX patterns and fail on broader or more complex problems.
- Not guaranteed to generalize outside mathematical reasoning.
- As a 1B-parameter model, its capacity is limited compared to larger LLMs.
## Ethical Considerations

- Intended for safe educational use.
- Should not be deployed in high-stakes environments without further validation.
- Outputs may contain errors; human oversight is required.
## Citation

If you use this model, please cite:

```bibtex
@misc{Parvesh2025M1MathX,
  author       = {Parvesh Rawal},
  title        = {Parveshiiii/M1-MathX: A Gemma-1B model fine-tuned on MathX with GRPO},
  year         = {2025},
  howpublished = {\url{https://huggingface.co/Parveshiiii/M1-MathX}}
}
```

---

This model was fine-tuned with GRPO for only 50 steps, using 4 samples per step. The result is exceptionally high accuracy on JEE-level mathematics problems, though its broader context handling and instruction-following ability were diminished. In essence, it is a compact "mini-tank": built for raw mathematical problem solving rather than nuanced general reasoning.