Talkie-1930-13B-IT

Talkie-LM (research)
Per the published model card, Talkie-1930-13B-IT is an Apache-2.0-licensed, instruction-tuned 13B model trained exclusively on pre-1931 English text (260B tokens, sourced from public-domain reference works). Its training-data transparency is unusually clean for AI Act Article 53 purposes; the main caveats are vendor jurisdiction (a US-affiliated research collaboration with no published EU DPA) and the deliberately vintage corpus, which makes the model unsuitable for any task requiring post-1931 factual knowledge.
Sovereignty
Licence: Apache 2.0 · Commercial: Unrestricted · Training data: Documented · Origin: US (research)
Licence facts
Parameters
13B (instruction-tuned)
Base model
talkie-1930-13b-base
Training corpus
260B tokens of pre-1931 English text (Internet Archive sources)
Knowledge cutoff
Pre-1931 (by design)
Post-training
SFT on pre-1931 instruction pairs + online DPO with LLM-as-judge
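The online-DPO step above amounts to sampling two completions per prompt, having an LLM judge rank them into a (prompt, chosen, rejected) triple, and optimizing the standard DPO objective. A minimal sketch of that loop, assuming a toy stand-in for the judge (the function names and the length-based scoring heuristic are illustrative placeholders, not the project's actual pipeline):

```python
import math

def judge_score(completion: str) -> float:
    # Hypothetical stand-in for the LLM-as-judge: a toy heuristic that
    # prefers longer completions. The real pipeline queries a judge model.
    return float(len(completion))

def make_preference_pair(prompt: str, completion_a: str, completion_b: str) -> dict:
    """Rank two sampled completions into a DPO (chosen, rejected) triple."""
    score_a, score_b = judge_score(completion_a), judge_score(completion_b)
    if score_a >= score_b:
        chosen, rejected = completion_a, completion_b
    else:
        chosen, rejected = completion_b, completion_a
    return {"prompt": prompt, "chosen": chosen, "rejected": rejected}

def dpo_loss(logp_chosen: float, logp_rejected: float,
             ref_logp_chosen: float, ref_logp_rejected: float,
             beta: float = 0.1) -> float:
    """Standard DPO objective: -log sigmoid(beta * (policy margin - reference margin))."""
    margin = (logp_chosen - logp_rejected) - (ref_logp_chosen - ref_logp_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))
```

When the policy and reference margins are equal, the loss sits at -log(0.5) ≈ 0.693; it falls as the policy widens its margin on the chosen completion.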
Known risks
  • Knowledge cutoff is pre-1931 by construction — the model has no awareness of any event, person, technology, terminology or norm from the last ~95 years and will confabulate or refuse on modern queries.
  • Project is a research collaboration (authors affiliated with Anthropic and Coefficient Giving) with no formal corporate entity; no GDPR DPA, AI Act compliance file or vendor SLA is published.
  • Pre-1931 corpus inherits the period's well-documented racial, colonial and gender biases; outputs require strong moderation before any user-facing deployment.
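Given the first and third risks above, a user-facing deployment would want a gate in front of the model that catches queries about post-1931 concepts before the model confabulates. A toy illustration of such a gate, assuming a keyword denylist (a real deployment would use a proper classifier; the term list is a hypothetical placeholder):

```python
# Illustrative deployment-side gate: flag queries mentioning post-1931
# concepts the model cannot know, so the application can warn the user
# instead of forwarding the query. The denylist below is a placeholder.
MODERN_TERMS = {"internet", "smartphone", "covid", "email", "television", "nasa"}

def flags_modern_query(query: str) -> bool:
    """Return True if the query mentions a known post-1931 term."""
    tokens = {t.strip(".,!?\"'").lower() for t in query.split()}
    return not MODERN_TERMS.isdisjoint(tokens)
```

The same pattern applies on the output side, where the risk bullet above calls for strong moderation before anything reaches users.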
Reviewed by Ali Madjaji · Last reviewed 2026-04-28