Parveshiiii committed on
Commit 4a3f223 · verified · 1 parent: 82743e4

Update README.md

Files changed (1)
  1. README.md +1 -63
README.md CHANGED
@@ -5,66 +5,4 @@ base_model:
  - google/gemma-3-1b-it
  pipeline_tag: text-generation
  ---
-
- # Model Card: Parveshiiii/M1-MathX
-
- ## Model Details
- - **Model Name:** Parveshiiii/M1-MathX
- - **Base Architecture:** Gemma (1B parameters)
- - **Model Type:** Causal Language Model (text-generation)
- - **Training Framework:** Hugging Face Transformers
- - **Precision:** fp16
- - **Attention Mechanism:** Hybrid sliding-window and full attention layers
- - **Tokenizer:** Gemma tokenizer (vocab size 262,144)
-
- ## Usage
- ```python
- from transformers import pipeline, TextStreamer
-
- pipe = pipeline("text-generation", model="Parveshiiii/M1-MathX")
- messages = [
-     {"role": "user", "content": "Who are you?"},
- ]
- streamer = TextStreamer(pipe.tokenizer)
- pipe(messages, streamer=streamer, max_new_tokens=10000)
- ```
- ## Intended Use
- - Designed for mathematical reasoning tasks, including problem solving, equation manipulation, and step-by-step derivations.
- - Suitable for educational contexts, math tutoring, and research experiments in reasoning alignment.
- - Not intended for general-purpose conversation or sensitive domains outside mathematics.
-
- ## Training Data
- - **Dataset:** MathX (curated mathematical reasoning dataset)
- - **Samples Used:** ~300
- - **Training Steps:** 50
- - **Method:** GRPO (Group Relative Policy Optimization) fine-tuning
- - **Objective:** Reinforcement-style alignment for improved reasoning clarity and correctness.
-
- ## Performance
- - Demonstrated strong performance on small-scale math problems and symbolic reasoning tasks.
- - Early benchmarks suggest improved accuracy compared to the base Gemma 1B model on math-specific datasets.
- - Requires formal evaluation on GSM8K, MATH, and other benchmarks for quantitative comparison.
-
- ## Limitations
- - Small dataset and limited training steps mean coverage is narrow.
- - May overfit to MathX patterns and fail on broader or more complex problems.
- - Not guaranteed to generalize outside mathematical reasoning.
- - As a 1B model, capacity is limited compared to larger LLMs.
-
- ## Ethical Considerations
- - Intended for safe educational use.
- - Should not be deployed in high-stakes environments without further validation.
- - Outputs may contain errors; human oversight is required.
-
- ## Citation
- If you use this model, please cite as:
- ```
- @misc{Parvesh2025M1MathX,
-   author = {Parvesh Rawal},
-   title = {Parveshiiii/M1-MathX: A Gemma-1B model fine-tuned on MathX with GRPO},
-   year = {2025},
-   howpublished = {\url{https://huggingface.co/Parveshiiii/M1-MathX}}
- }
- ```
-
- ---
 
+ This model was fine-tuned with GRPO for only 50 steps, using 4 samples per step. The result is exceptionally high accuracy on JEE-level mathematics problems, though its broader context handling and instruction-following abilities were diminished. In essence, it has become a compact powerhouse: a "mini-tank" built for raw mathematical problem-solving rather than nuanced reasoning.
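For context on the training method mentioned above: GRPO (Group Relative Policy Optimization) samples a group of completions per prompt and scores each one relative to its own group, rather than using a learned value model. The sketch below illustrates only that group-relative advantage computation; the function name is hypothetical and this is not the actual training code, which (per the diff) used a full GRPO fine-tuning pipeline.

```python
# Illustrative sketch of GRPO's core idea: rewards for a group of sampled
# completions are normalized against the group's own mean and std, so each
# completion is scored relative to its siblings for the same prompt.
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-8):
    """Return per-completion advantages normalized within one group.

    `rewards` holds one scalar reward per sampled completion for a single
    prompt; `eps` guards against division by zero when all rewards tie.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# One group of 4 sampled completions, matching the "4 samples per step"
# described above: two correct (reward 1.0) and two incorrect (reward 0.0).
adv = group_relative_advantages([1.0, 0.0, 1.0, 0.0])
```

Because advantages are centered within each group, correct completions are pushed up and incorrect ones pushed down even when absolute rewards are sparse, which is what makes such short runs (50 steps here) able to move the policy at all.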