Deqing Fu PRO
deqing
AI & ML interests
None yet
Recent Activity
updated a collection about 3 hours ago
Fourier Language Model
liked a model about 3 hours ago
deqing/fone-llama-3.2-1B-fineweb-sample-100BT-fone3d-hybrid-tile-v4
updated a model about 4 hours ago
deqing/llama-300M-v5-window_2
Organizations
Convergent Evolution (Addition)
- deqing/convergent-llama-300M-muon-addition_3digit
  Text Generation • 0.3B • Updated • 983
- deqing/convergent-llama-300M-muon-addition_3digit_seed123
  0.3B • Updated • 285
- deqing/convergent-llama-300M-muon-addition
  Text Generation • 0.3B • Updated • 1.91k
- deqing/convergent-llama-300M-adamw-addition_3digit
  Text Generation • 0.3B • Updated • 816
Convergent Evolution (Data)
- deqing/convergent-llama-300M-muon-original
  Text Generation • 0.3B • Updated • 725
- deqing/convergent-llama-300M-muon-unigram
  Text Generation • 0.3B • Updated • 232
- deqing/convergent-llama-300M-muon-isolate
  Text Generation • 0.3B • Updated • 270
- deqing/convergent-llama-300M-muon-swap_numbers
  Text Generation • 0.3B • Updated • 265
Convergent Evolution
Convergent Evolution (Architecture and Optimizer)
- deqing/convergent-llama-300M-muon-original
  Text Generation • 0.3B • Updated • 725
- deqing/convergent-gdn-300M-muon-original
  Text Generation • 0.3B • Updated • 349
- deqing/convergent-mamba2-300M-muon-original
  Text Generation • 0.3B • Updated • 310
- deqing/convergent-lstm-4layer-muon-original
  Text Generation • 0.2B • Updated • 360
Fourier Language Model