Language Models
-
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
January 22, 2025
-
DeepSeek-V3 Technical Report
December 27, 2024
-
ModernBERT - Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference
December 18, 2024
-
Tulu 3: Pushing Frontiers in Open Language Model Post-Training
November 22, 2024
-
Gemma 2: Improving Open Language Models at a Practical Size
July 31, 2024
-
The Llama 3 Herd of Models
July 31, 2024
-
Apple Intelligence Foundation Language Models
July 29, 2024
-
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
May 7, 2024
-
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
February 5, 2024
-
DeepSeek-Coder: When the Large Language Model Meets Programming - The Rise of Code Intelligence
January 25, 2024
-
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
January 11, 2024
-
Mixtral of Experts
January 8, 2024
-
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
January 5, 2024
-
Mistral 7B
October 10, 2023
-
Llama 2: Open Foundation and Fine-Tuned Chat Models
July 18, 2023
-
LLaMA: Open and Efficient Foundation Language Models
February 27, 2023
Retrieval Augmented Generation