ToyLlama 50M

Third version of the ToyLlama model. See more ToyLlamas in my profile.

(P.S. This one was intended to be 30M v2, but because the amount of training data increased, I chose to use the pretrained openai-community/gpt2 tokenizer.)

Example generations

All examples were generated with the test.py interactive CLI. (Usage: python3 test.py)

>>> Enter prompt: Once upon a time, the cow walked to the park
------------------------------------------------------------
Once upon a time, the cow walked to the park in the west of the town of Pippa. It was the first of the town's most popular and popular sports. The town was renamed a "Woo" in January 2019.
The town's name was changed to "Doves of the Year" in May 2021.
References
External links
1952 births
Living people
American male football players
American football pitchers
21st-century American football players
21st-century American football players
People from Bambisha, New York
Canadian men's footballers
New York City footballers
Basketball players from New York (state)
Players of American football from New York (state)
People from Santa Cruz, New York (state)
Australian
------------------------------------------------------------
>>> Enter prompt: Abraham Lincoln
------------------------------------------------------------
Abraham Lincoln, the first of the first students of the National Academy of Sciences, the first of the first students of the National Academy of Sciences. 
The university was founded in 2005 by the National Academy of Sciences in 2006.
The first student school of the university was named for the first student school in the University of Chicago in 2006. The college is named after the college.
The college is located in the north-central corner of the campus.  The school is located in the south-west corner of the campus.
The school is located in the centre of the city, and is now part of the University of Minnesota.
See also
List of the School of Arts and Sciences
References
External links
 University of Wisconsin at University
------------------------------------------------------------

(As you can see, its output has become slightly Wikipedia-like.)
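
For readers who want to try similar prompts without test.py, here is a minimal sketch using the Hugging Face transformers library. It is not the contents of test.py. It assumes the checkpoint is published as sapbot/toyllama-50m and loads through AutoModelForCausalLM, and the sampling settings (temperature, top_p, max_new_tokens) are illustrative guesses rather than the values used for the examples above.

```python
def generate(prompt: str,
             model_id: str = "sapbot/toyllama-50m",
             max_new_tokens: int = 100) -> str:
    """Sample a continuation for `prompt` (sketch; settings are assumptions)."""
    # Imported lazily so this file can be imported without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=True,        # assumed: sampling rather than greedy decoding
            temperature=0.8,       # assumed value
            top_p=0.95,            # assumed value
            pad_token_id=tokenizer.eos_token_id,  # GPT-2 tokenizer has no pad token
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

# Example call (downloads the model on first use):
# print(generate("Once upon a time, the cow walked to the park"))
```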

Training information

Trained for 26 hours on one RX 6600 (8GB VRAM).

Parameter                     Value
Loss                          2.5885
Epochs                        1
grad_norm                     0.5695
Learning rate                 5e-4
Batch size                    8
Gradient accumulation steps   4
Training tokens               13763555964 (~13.8B)
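
The batch size and gradient accumulation steps combine into an effective batch per optimizer step. The sketch below works that out, assuming the batch size counts sequences on the single GPU. The context length is not stated in this card, so `seq_len` is a hypothetical parameter, and 1024 is only an example value.

```python
BATCH_SIZE = 8                    # from the table above
GRAD_ACCUM_STEPS = 4              # from the table above
TRAINING_TOKENS = 13_763_555_964  # from the table above

# Sequences contributing to each optimizer update.
effective_batch = BATCH_SIZE * GRAD_ACCUM_STEPS

def optimizer_steps(seq_len: int) -> int:
    """Optimizer steps needed to consume all training tokens.

    `seq_len` is a hypothetical context length; the card does not state it.
    """
    tokens_per_step = effective_batch * seq_len
    # Integer ceiling division, so a final partial batch counts as a step.
    return -(-TRAINING_TOKENS // tokens_per_step)

print(effective_batch)        # 32
print(optimizer_steps(1024))  # steps under the assumed 1024-token context
```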

Training data

Training data came from Hugging Face datasets plus several other sites (2262401 files total):

$ du -sh ./data/*
12K	./data/cows.info.gf
472K	./data/github.com
5,8M	./data/gutenberg.org
88K	./data/habr.com
9,0G	./data/huggingface.co
474M	./data/textfiles.com
12K	./data/www.reddit.com
$ du -sh ./data/
9,5G	./data/
$ du -sh ./data/huggingface.co/*/*
34M	./data/huggingface.co/cornell-movie-review-data/rotten_tomatoes
21M	./data/huggingface.co/EleutherAI/lambada_openai
5,8M	./data/huggingface.co/gaianet/bible_bot
92K	./data/huggingface.co/gaianet/paris
344K	./data/huggingface.co/gaianet/trumpVSharris
16K	./data/huggingface.co/krinal/fifa_2022
680M	./data/huggingface.co/prayslaks/wikimedia_wikipedia_100K
8,2G	./data/huggingface.co/roneneldan/TinyStories
68M	./data/huggingface.co/stas/openwebtext-10k
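
To make the data mix easier to read, the sketch below converts the top-level `du -sh` sizes above into approximate shares of the corpus. It assumes `du -h` reports binary units (K = 1024 bytes) and that the comma is a locale decimal separator. Because `du` rounds its output, the shares are only approximate.

```python
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}

def parse_du_size(text: str) -> float:
    """Convert a `du -h` size such as '5,8M' or '474M' to bytes."""
    number, unit = text[:-1], text[-1]
    return float(number.replace(",", ".")) * UNITS[unit]

# Top-level directory sizes copied from the listing above.
top_level = {
    "cows.info.gf": "12K",
    "github.com": "472K",
    "gutenberg.org": "5,8M",
    "habr.com": "88K",
    "huggingface.co": "9,0G",
    "textfiles.com": "474M",
    "www.reddit.com": "12K",
}

sizes = {name: parse_du_size(size) for name, size in top_level.items()}
total = sum(sizes.values())
shares = {name: size / total for name, size in sizes.items()}

# Print sources from largest to smallest share.
for name, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{name:16s} {share:7.2%}")
```

By this estimate, the huggingface.co directory accounts for roughly 95% of the bytes, and most of that is TinyStories.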

Advanced model information

Detailed information about the exact model configuration.

Parameter           Value
Hidden size         512
Intermediate size   1536
Hidden layers       8
Attention heads     8
Key value heads     4
Total parameters    50906112
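
The total can be reproduced from the values above. The sketch below counts parameters for a standard Llama-style layout under these assumptions, none of which the card states: a vocabulary of 50257 (the GPT-2 tokenizer size), no bias terms, input and output embeddings tied, and two RMSNorm weight vectors per layer plus a final one. The result matches the stated total, which suggests these assumptions hold for this checkpoint.

```python
def llama_param_count(vocab_size: int, hidden_size: int, intermediate_size: int,
                      num_layers: int, num_heads: int, num_kv_heads: int,
                      tie_embeddings: bool = True) -> int:
    """Count parameters of a Llama-style decoder (no biases assumed)."""
    head_dim = hidden_size // num_heads
    kv_dim = num_kv_heads * head_dim  # grouped-query attention shrinks K and V

    attention = (
        2 * hidden_size * hidden_size   # q_proj and o_proj
        + 2 * hidden_size * kv_dim      # k_proj and v_proj
    )
    mlp = 3 * hidden_size * intermediate_size  # gate, up, and down projections
    norms = 2 * hidden_size                    # two RMSNorm weights per layer
    per_layer = attention + mlp + norms

    embeddings = vocab_size * hidden_size
    lm_head = 0 if tie_embeddings else vocab_size * hidden_size
    final_norm = hidden_size
    return embeddings + num_layers * per_layer + final_norm + lm_head

total = llama_param_count(
    vocab_size=50257,  # assumed: GPT-2 tokenizer vocabulary
    hidden_size=512, intermediate_size=1536,
    num_layers=8, num_heads=8, num_kv_heads=4,
)
print(total)  # 50906112
```

Roughly half of the parameters (50257 × 512 = 25731584) sit in the embedding table, which is typical for a small model with a large vocabulary.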