🤖 Nano GPT - Built From Scratch

Hey there! Welcome to my tiny language model. I built this GPT from scratch as a learning project, and honestly, it was pretty fun watching it learn to generate text!

What is this?

This is a super small GPT-2 style language model that I trained on my laptop. It's not going to write your essays or solve world hunger, but it's a cool demonstration of how these language models actually work under the hood.

Think of it as a baby GPT - it can generate text, but don't expect Shakespeare. More like... an enthusiastic toddler who just learned to talk.

Model Stats

Parameters: 1,065,728 (about 1.07 million - yes, that's million with an M, not billion!)
Layers: 4 transformer layers
Embedding Size: 128 dimensions
Attention Heads: 4 heads
Context Length: 128 tokens
Vocab Size: 2000 tokens
Training Data: WikiText-2 (5,000 samples)
Training Time: 10 epochs on my laptop
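
If you're curious where that parameter count comes from, here's a minimal sketch that adds it up from the stats above. It assumes the standard GPT-2 layout: learned position embeddings, biases on every linear layer, a 4x-wide MLP, and an output head that shares weights with the token embedding. The function name and its arguments are made up for illustration; this isn't code from the training script.

```python
def gpt2_param_count(vocab_size, n_positions, n_embd, n_layer,
                     mlp_ratio=4, tie_lm_head=True):
    """Count parameters of a GPT-2 style model (assumed standard layout)."""
    token_emb = vocab_size * n_embd           # token embedding table
    pos_emb = n_positions * n_embd            # learned position embeddings

    layer_norm = 2 * n_embd                   # scale + shift
    attn_qkv = n_embd * 3 * n_embd + 3 * n_embd   # fused Q, K, V projection
    attn_out = n_embd * n_embd + n_embd           # attention output projection
    hidden = mlp_ratio * n_embd
    mlp_in = n_embd * hidden + hidden             # MLP up-projection
    mlp_out = hidden * n_embd + n_embd            # MLP down-projection
    per_layer = 2 * layer_norm + attn_qkv + attn_out + mlp_in + mlp_out

    final_ln = layer_norm
    # With tied weights the output head reuses the token embedding matrix.
    lm_head = 0 if tie_lm_head else vocab_size * n_embd

    return token_emb + pos_emb + n_layer * per_layer + final_ln + lm_head


total = gpt2_param_count(vocab_size=2000, n_positions=128, n_embd=128, n_layer=4)
print(f"{total:,}")  # → 1,065,728
```

The number of attention heads doesn't appear in the sum: splitting the 128 dimensions into 4 heads changes how attention is computed, not how many weights there are. The total matches the reported count under these assumptions, which suggests the token embedding and output head are tied.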

Quick Start

Want to try it out? Here's how:

from transformers import pipeline

# Load the model
generator = pipeline('text-generation', model='Tanaybh/nano-gpt-from-scratch')

# Generate some text
output = generator(
    "The meaning of life is",
    max_new_tokens=30,
    do_sample=True,
    temperature=0.8
)

print(output[0]['generated_text'])
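
To get a feel for what do_sample=True and temperature=0.8 do in the call above, here's a rough pure-Python sketch of temperature sampling over a handful of logits. It's an illustration of the general technique, not the Transformers implementation, and the helper names are hypothetical.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw logits into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)                      # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_next_token(logits, temperature=0.8, rng=None):
    """Draw one token index at random, weighted by the temperature-scaled probabilities."""
    rng = rng or random.Random()
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


toy_logits = [2.0, 1.0, 0.1]                # made-up scores for a 3-token vocabulary
print(softmax_with_temperature(toy_logits, temperature=1.0))
print(softmax_with_temperature(toy_logits, temperature=0.5))
print(sample_next_token(toy_logits, temperature=0.8, rng=random.Random(0)))
```

Lower temperatures push more probability onto the highest-scoring token, so the output gets more repetitive and predictable; higher temperatures spread it out and make the text more random.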

Example Output

I gave it the prompt: "The "

And it generated:

The × 60 munitions, and injuries were found in the taxonomy in the south, the east of the

Not bad for a tiny model trained in a few hours, right?

Training Details

I trained this model from scratch using:

Custom BPE tokenizer (trained on the same data)
GPT-2 architecture (just way smaller)
AdamW optimizer with a learning rate of 0.0005
Batch size of 8
Trained for 10 epochs
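
For anyone wondering what "training a BPE tokenizer" boils down to, here's a toy sketch of the core loop: count adjacent symbol pairs, merge the most frequent one, repeat. It's a character-level simplification written for illustration (the function names and the tiny corpus are invented), not the tokenizer code used for this model, which stops once the vocabulary reaches its target size of 2000 tokens.

```python
from collections import Counter


def most_frequent_pair(words):
    """Return the most common adjacent symbol pair, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in words.items():
        for left, right in zip(symbols, symbols[1:]):
            pairs[(left, right)] += freq
    return max(pairs, key=pairs.get) if pairs else None


def merge_pair(words, pair):
    """Rewrite every word so that each occurrence of `pair` becomes one symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def train_bpe(corpus, num_merges):
    """Learn up to `num_merges` merge rules from a whitespace-split corpus."""
    words = dict(Counter(tuple(word) for word in corpus.split()))
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(words)
        if pair is None:          # every word is already a single symbol
            break
        merges.append(pair)
        words = merge_pair(words, pair)
    return merges


print(train_bpe("low low low lower lowest", num_merges=3))
```

Each learned merge adds one new token to the vocabulary, so a bigger vocabulary means more merges and longer tokens.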

The whole thing runs on a regular laptop - no fancy GPU clusters needed!
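
A quick back-of-the-envelope calculation shows why a laptop is enough. The sketch below assumes each of the 5,000 samples is one training example and that there is no gradient accumulation; both are assumptions, and the function name is just for illustration.

```python
import math


def optimizer_steps(num_samples, batch_size, epochs):
    """Return (steps per epoch, total steps), keeping a final partial batch."""
    steps_per_epoch = math.ceil(num_samples / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs


per_epoch, total = optimizer_steps(num_samples=5000, batch_size=8, epochs=10)
print(per_epoch)  # → 625
print(total)      # → 6250
```

A few thousand updates on a model with about a million parameters is a small workload by modern standards.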

Limitations

Let's be real here:

This model is TINY. Like, really tiny. It has 1,065,728 parameters vs GPT-3's 175 billion, which is roughly 164,000 times more.
It was only trained on 5,000 WikiText-2 samples (Wikipedia text), so its knowledge is... limited.
It might generate weird or nonsensical text sometimes. That's part of the charm!
Maximum context length is only 128 tokens, so don't expect long conversations.
It's a base model with no instruction tuning, so it just continues text rather than following commands.
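
Because of the 128-token limit, the prompt and the generated tokens have to fit in the same window. One way to handle a long prompt, sketched here on plain lists of token ids, is to keep only the most recent tokens and leave room for what you want to generate. The helper name is hypothetical, and whether you'd rather truncate or raise an error is a design choice.

```python
CONTEXT_LENGTH = 128  # maximum sequence length from the model stats


def fit_prompt(token_ids, max_new_tokens, context_length=CONTEXT_LENGTH):
    """Keep the last tokens of the prompt so prompt + new tokens fit the window."""
    budget = context_length - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens leaves no room for a prompt")
    return token_ids[-budget:]


long_prompt = list(range(200))             # stand-in for 200 token ids
trimmed = fit_prompt(long_prompt, max_new_tokens=30)
print(len(trimmed))                        # → 98
```

Keeping the end of the prompt rather than the start is deliberate: a model that just continues text depends most on the words right before the point where generation begins.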

Why I Made This

I wanted to understand how language models work by building one myself. Sure, I could've just fine-tuned a pre-trained model, but where's the fun in that? This project taught me about:

Tokenizer training
Transformer architecture
Training dynamics
How LLMs actually generate text

Plus, now I can say I trained a language model from scratch on my laptop. Pretty cool, right?

Future Improvements

Some things I might try:

Train on more data (maybe the full WikiText dataset)
Experiment with different model sizes
Try different tokenizer configurations
Add instruction tuning
Fine-tune it for specific tasks

License

MIT - Feel free to use this however you want! Learn from it, break it, improve it. That's what it's here for.

Acknowledgments

Built with:

🤗 Hugging Face Transformers
PyTorch
The WikiText dataset
Too much coffee ☕

Note: This is a learning project and experimental model. Use it for fun and education, not production systems!

If you found this interesting or helpful, feel free to star the repo or reach out. Always happy to chat about ML stuff!

Last updated: October 05, 2025

Downloads last month: 8

Model size: 1.07M params (Safetensors)
Tensor type: F32
