this post was submitted on 20 Jan 2025
Piracy: ꜱᴀɪʟ ᴛʜᴇ ʜɪɢʜ ꜱᴇᴀꜱ
⚓ Dedicated to the discussion of digital piracy, including ethical problems and legal advancements.
Weight leaks for semi-open models have been fairly common in the past. Meta's LLaMA 1 was not originally released openly (weights went only to approved researchers), but the weights leaked and spread pretty rapidly, effectively laundered through finetunes and merges, which led Meta to embrace quasi-open source after the fact. Likewise, most of the anime-style Stable Diffusion 1.5 models were based on NovelAI's custom finetune, whose leaked weights were laundered the same way and became ubiquitous.
Those incidents date from late 2022 and early 2023. Aside from some of the biggest players (OpenAI, Google, Anthropic, and I guess Apple, kinda), open-weight releases (usually not open source) have become the norm, even for frontier models like DeepSeek-V3, Qwen 2.5 and Llama 3.1, so piracy in that case is moot (although it's safe to assume that use that doesn't comply with the licenses is also ubiquitous). A leak of a currently closed frontier model would be interesting from an academic and journalistic perspective, since it would let people dig into the architecture and assess things like safety and regurgitation outside of the online service shell, but those models would need so much compute that they'd be unusable by individual actors.
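To put a rough number on the compute point, here's a back-of-envelope sketch of how much memory the weights alone take up. It only uses the publicly stated total parameter counts of the open-weight models mentioned above, since the sizes of the closed models aren't public. The function names, the bytes-per-parameter table, and the 24 GB "consumer GPU" figure are my own assumptions for illustration, and it ignores KV cache, activations, and any offloading tricks.

```python
import math

# Approximate storage cost per parameter at common precisions (assumption:
# quantization overhead such as scales and zero points is ignored).
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

# Publicly stated total parameter counts, in billions.
MODELS_BILLIONS = {
    "Qwen 2.5 72B": 72,
    "Llama 3.1 405B": 405,
    "DeepSeek-V3": 671,
}


def weight_memory_gb(params_billions, precision):
    """GB (10^9 bytes) needed to hold the weights alone."""
    return params_billions * BYTES_PER_PARAM[precision]


def gpus_needed(params_billions, precision, vram_gb=24):
    """Minimum number of GPUs with vram_gb each, counting weights only."""
    return math.ceil(weight_memory_gb(params_billions, precision) / vram_gb)


if __name__ == "__main__":
    for name, size in MODELS_BILLIONS.items():
        for precision in BYTES_PER_PARAM:
            gb = weight_memory_gb(size, precision)
            n = gpus_needed(size, precision)
            print(f"{name} @ {precision}: {gb:.1f} GB, at least {n} x 24 GB GPUs")
```

Even at 4-bit, DeepSeek-V3's weights come to roughly 335 GB before any inference overhead, and that's one of the models you can already download legally. A closed frontier model of similar or larger size would be in the same boat or worse.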