- Vibe Coding vs. Agentic Coding: Fundamentals and Practical Implications of Agentic AI
  Paper • 2505.19443 • Published • 15
- Skywork-SWE: Unveiling Data Scaling Laws for Software Engineering in LLMs
  Paper • 2506.19290 • Published • 53
- CodeNet: A Large-Scale AI for Code Dataset for Learning a Diversity of Coding Tasks
  Paper • 2105.12655 • Published
- StarCoder 2 and The Stack v2: The Next Generation
  Paper • 2402.19173 • Published • 156
Collections including paper arxiv:2402.19173
- Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots
  Paper • 2405.07990 • Published • 20
- Large Language Models as Planning Domain Generators
  Paper • 2405.06650 • Published • 13
- AutoCrawler: A Progressive Understanding Web Agent for Web Crawler Generation
  Paper • 2404.12753 • Published • 43
- OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
  Paper • 2404.07972 • Published • 52
- GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models
  Paper • 2508.06471 • Published • 211
- NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model
  Paper • 2508.14444 • Published • 47
- Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities
  Paper • 2507.06261 • Published • 67
- MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention
  Paper • 2506.13585 • Published • 274
- FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language
  Paper • 2506.20920 • Published • 78
- SmolVLM: Redefining small and efficient multimodal models
  Paper • 2504.05299 • Published • 207
- YourBench: Easy Custom Evaluation Sets for Everyone
  Paper • 2504.01833 • Published • 23
- SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model
  Paper • 2502.02737 • Published • 258
- Mamba: Linear-Time Sequence Modeling with Selective State Spaces
  Paper • 2312.00752 • Published • 150
- Elucidating the Design Space of Diffusion-Based Generative Models
  Paper • 2206.00364 • Published • 18
- GLU Variants Improve Transformer
  Paper • 2002.05202 • Published • 5
- StarCoder 2 and The Stack v2: The Next Generation
  Paper • 2402.19173 • Published • 156