Nano GPT - Built From Scratch
Hey there! Welcome to my tiny language model. I built this GPT from scratch as a learning project, and honestly, it was pretty fun watching it learn to generate text!
What is this?
This is a super small GPT-2 style language model that I trained on my laptop. It's not going to write your essays or solve world hunger, but it's a cool demonstration of how these language models actually work under the hood.
Think of it as a baby GPT - it can generate text, but don't expect Shakespeare. More like... an enthusiastic toddler who just learned to talk.
Model Stats
- Parameters: 1,065,728 (yes, that's million with an M, not billion!)
- Layers: 4 transformer layers
- Embedding Size: 128 dimensions
- Attention Heads: 4 heads
- Context Length: 128 tokens
- Vocab Size: 2000 tokens
- Training Data: WikiText-2 (5,000 samples)
- Training Time: 10 epochs on my laptop
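If you want to see where the parameter count comes from, here's a rough tally in plain Python. It assumes the standard GPT-2 layout: learned position embeddings, a 4x MLP expansion, biases on every linear layer, two LayerNorms per block plus a final one, and an output head that shares weights with the token embedding. Those layout details are assumptions, not something listed in the stats above, and the function name is just for illustration. The number of attention heads doesn't change the total.

```python
def gpt2_param_count(vocab_size, n_positions, n_embd, n_layer, tie_embeddings=True):
    """Count parameters for a GPT-2 style model (assumed standard layout)."""
    d = n_embd
    token_emb = vocab_size * d            # token embedding table
    pos_emb = n_positions * d             # learned position embeddings
    # Attention: fused QKV projection plus output projection, both with biases
    attn = (d * 3 * d + 3 * d) + (d * d + d)
    # MLP: expand to 4*d, then project back to d, both with biases
    mlp = (d * 4 * d + 4 * d) + (4 * d * d + d)
    # Two LayerNorms per block, each with a weight and a bias vector
    layer_norms = 2 * (2 * d)
    per_layer = attn + mlp + layer_norms
    final_ln = 2 * d                      # final LayerNorm after the last block
    total = token_emb + pos_emb + n_layer * per_layer + final_ln
    if not tie_embeddings:
        total += vocab_size * d           # separate output head, if not shared
    return total


total = gpt2_param_count(vocab_size=2000, n_positions=128, n_embd=128, n_layer=4)
print(f"{total:,}")  # 1,065,728
```

Under those assumptions the tally lands exactly on the figure in the list, with about a quarter of the parameters (256,000) sitting in the token embedding table.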
Quick Start
Want to try it out? Here's how:
from transformers import pipeline
# Load the model
generator = pipeline('text-generation', model='Tanaybh/nano-gpt-from-scratch')
# Generate some text
output = generator(
    "The meaning of life is",
    max_new_tokens=30,
    do_sample=True,
    temperature=0.8
)
print(output[0]['generated_text'])
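Curious what `do_sample=True` and `temperature=0.8` are doing? Roughly speaking, the model produces a score (logit) for every token in the vocab, the scores get divided by the temperature, and the next token is drawn from the resulting probabilities. The sketch below shows that idea in plain Python. The function names and the toy logits are made up for illustration, and this is not the actual Transformers implementation.

```python
import math
import random


def softmax_with_temperature(logits, temperature=0.8):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)                       # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_next_token(logits, temperature=0.8, rng=random):
    """Draw one token index at random, weighted by the temperature-scaled probabilities."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Toy example: four hypothetical tokens with made-up scores
toy_logits = [2.0, 1.0, 0.5, -1.0]
print(softmax_with_temperature(toy_logits, temperature=0.8))
print(sample_next_token(toy_logits, temperature=0.8))
```

Lower temperatures push more probability onto the top-scoring token (safer, more repetitive text), while higher ones flatten the distribution (more variety, more nonsense).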
Example Output
I gave it the prompt: "The"
And it generated:
The Γ 60 munitions, and injuries were found in the taxonomy in the south, the east of the
Not bad for a tiny model trained in a few hours, right?
Training Details
I trained this model from scratch using:
- Custom BPE tokenizer (trained on the same data)
- GPT-2 architecture (just way smaller)
- AdamW optimizer with a learning rate of 0.0005
- Batch size of 8
- Trained for 10 epochs
The whole thing runs on a regular laptop - no fancy GPU clusters needed!
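For anyone wondering what "training a BPE tokenizer" actually involves, here's a toy version of the core loop: count adjacent symbol pairs, merge the most frequent one, repeat. This is a character-level sketch for illustration only, with made-up function names. It is not the code used for this model's tokenizer, and real tokenizers work on bytes and handle whitespace and special tokens more carefully.

```python
from collections import Counter


def most_frequent_pair(words):
    """Count adjacent symbol pairs across all words, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs.most_common(1)[0][0] if pairs else None


def merge_pair(words, pair):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def train_bpe(corpus, num_merges):
    """Learn up to `num_merges` merge rules from a whitespace-split corpus."""
    words = Counter(tuple(word) for word in corpus.split())
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(words)
        if pair is None:          # nothing left to merge
            break
        merges.append(pair)
        words = merge_pair(words, pair)
    return merges


print(train_bpe("low low low lower lowest", num_merges=3))
```

Each merge adds one new symbol to the vocabulary, so a vocab of 2000 tokens corresponds to a base alphabet plus however many merges it takes to reach that size.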
Limitations
Let's be real here:
- This model is TINY. Like, really tiny. It has 1,065,728 parameters vs GPT-3's 175 billion.
- It was only trained on 5,000 samples from WikiText-2 (Wikipedia text), so its knowledge is... limited.
- It might generate weird or nonsensical text sometimes. That's part of the charm!
- Maximum context length is only 128 tokens, so don't expect long conversations.
- It's a base model with no instruction tuning, so it just continues text rather than following commands.
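One practical consequence of the 128-token limit: the prompt and the newly generated tokens have to share that window. A simple way to stay inside it is to keep only the most recent prompt tokens and leave room for the output. The helper below is a hypothetical sketch of that idea working on a plain list of token IDs. It is just one possible strategy, not something the pipeline is known to do for you.

```python
CONTEXT_LENGTH = 128  # from the model stats


def fit_prompt(token_ids, max_new_tokens, context_length=CONTEXT_LENGTH):
    """Keep the most recent prompt tokens so prompt + new tokens fit in the context."""
    budget = context_length - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens leaves no room for the prompt")
    return token_ids[-budget:]


# Toy example: a 200-token prompt with room reserved for 30 new tokens
long_prompt = list(range(200))
trimmed = fit_prompt(long_prompt, max_new_tokens=30)
print(len(trimmed))  # 98
```

Dropping the oldest tokens means the model forgets the start of the text, which is the trade-off behind "don't expect long conversations".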
Why I Made This
I wanted to understand how language models work by building one myself. Sure, I could've just fine-tuned a pre-trained model, but where's the fun in that? This project taught me about:
- Tokenizer training
- Transformer architecture
- Training dynamics
- How LLMs actually generate text
Plus, now I can say I trained a language model from scratch on my laptop. Pretty cool, right?
Future Improvements
Some things I might try:
- Train on more data (maybe the full WikiText dataset)
- Experiment with different model sizes
- Try different tokenizer configurations
- Add instruction tuning
- Fine-tune it for specific tasks
License
MIT - Feel free to use this however you want! Learn from it, break it, improve it. That's what it's here for.
Acknowledgments
Built with:
- 🤗 Hugging Face Transformers
- PyTorch
- The WikiText dataset
- Too much coffee ☕
Note: This is a learning project and experimental model. Use it for fun and education, not production systems!
If you found this interesting or helpful, feel free to star the repo or reach out. Always happy to chat about ML stuff!
Last updated: October 05, 2025