Dpo
17 items across the graph are tagged with Dpo.
From the graph · 17
Hugging Face model with 13461 likes. Tags: diffusers, safetensors, text-to-image, image-generation, flux, en, license:other, endpoints_compatible, diffusers:Flu…
Hugging Face model with 5267 likes. Tags: diffusers, safetensors, text-to-image, image-generation, flux, en, license:apache-2.0, endpoints_compatible, diffusers…
Hugging Face model with 5141 likes. Tags: transformers, safetensors, deepseek_v4, text-generation, arxiv:2606.19348, license:mit, eval-results, endpoints_compat…
Hugging Face model with 4943 likes. Tags: transformers, safetensors, gpt_oss, text-generation, vllm, conversational, arxiv:2508.10925, license:apache-2.0, eval-…
Hugging Face model with 4759 likes. Tags: transformers, safetensors, gpt_oss, text-generation, vllm, conversational, arxiv:2508.10925, license:apache-2.0, eval-…
Hugging Face model with 4094 likes. Tags: transformers, safetensors, deepseek_v3, text-generation, conversational, custom_code, arxiv:2412.19437, eval-results,…
Hugging Face model with 3918 likes. Tags: diffusers, stable-diffusion, text-to-image, dataset:Nerfgun3/bad_prompt, license:creativeml-openrail-m, endpoints_comp…
Hugging Face model with 3624 likes. Tags: transformers, pytorch, multi_modality, muiltimodal, text-to-image, unified-model, any-to-any, arxiv:2501.17811, licens…
Hugging Face model with 3475 likes. Tags: transformers, safetensors, phi, text-generation, nlp, code, en, license:mit, text-generation-inference, endpoints_comp…
Hugging Face model with 3223 likes. Tags: diffusers, safetensors, stable-diffusion, text-to-image, en, license:creativeml-openrail-m, endpoints_compatible, diff…
🚀 An open-source, hands-on curriculum bridging the gap from basic RL concepts to LLM alignment, RLVR, and advanced Agentic systems.
Hugging Face model with 3119 likes. Tags: transformers, safetensors, gemma4, image-text-to-text, conversational, base_model:google/gemma-4-31B, base_model:finet…
Slips, a free software behavioral Python intrusion prevention system (IDS/IPS) that uses machine learning to detect malicious behaviors in the network traffic.…
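Chinese-language study notes that read the MiniMind source code line by line and extend to the broader large-model technology stack: pretraining / SFT / DPO / PPO / GRPO, training mechanics, a MiniMind2→3 version comparison, and evidence from real experiments.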
Soup turns the pain of LLM fine-tuning into a simple workflow. One config, one command, done.
Enterprise-grade browser extension bringing multilingual voice interaction to AI chatbots (Pi, Claude, ChatGPT). Features real-time speech detection with Silero…
Research platform for model training, evaluation, and experimentation across architectures, benchmarks, and recipes.
