CleanSource

The Library

Every text is public-domain, human-written, and pre-2022. Download any as a clean, RAG-ready bundle — 5 free per month.

17 texts

Philosophy & Stoicism · c. 180 CE

Meditations

Marcus Aurelius

The private notebook of a Roman emperor, written to himself as Stoic self-correction. Short, aphoristic entries on duty, mortality, anger, and accepting what is outside one's control. A clean, dense source of moral-reasoning prose untouched by modern editorializing.

#stoicism #philosophy #ethics

Philosophy & Stoicism · 375 BCE

The Republic

Plato

Plato's dialogue on justice, the ideal state, and the nature of knowledge, including the allegory of the cave. Its dialectical question-and-answer structure makes it a rich corpus for reasoning-chain and argument-modeling tasks.

#philosophy #political-theory #dialectic

Philosophy & Stoicism · 1883

Thus Spake Zarathustra

Friedrich Nietzsche

Nietzsche's philosophical novel in prophetic, poetic prose — the Übermensch, eternal recurrence, and the revaluation of values, delivered as parable. Distinctive stylistic register, valuable for tone and rhetoric modeling.

#philosophy #existentialism #prose-poetry

Philosophy & Stoicism · 1886

Beyond Good and Evil

Friedrich Nietzsche

A critique of past philosophers and traditional morality, arguing for a philosophy of the future grounded in the will to power. Tight argumentative aphorisms across 296 numbered sections — well-chunked for retrieval.

#philosophy #ethics #epistemology

Science & Reason · 1859

On the Origin of Species

Charles Darwin

Darwin's foundational argument for evolution by natural selection, built from patient observation and careful inductive reasoning. A model of evidence-driven scientific prose for grounding RAG over primary-source science.

#science #biology #evolution

Science & Reason · 1916

Relativity: The Special and General Theory

Albert Einstein

Einstein's own popular exposition of special and general relativity, written for the general reader. Clear explanatory structure with worked thought-experiments — useful for technical-explanation and pedagogy datasets.

#science #physics #relativity

Economics & Society · 1776

The Wealth of Nations

Adam Smith

The founding text of modern economics: division of labour, markets, value, and the 'invisible hand'. Long-form analytical prose with extended examples — a large, coherent corpus for economic-reasoning grounding.

#economics #markets #political-economy

Economics & Society · 1859

On Liberty

John Stuart Mill

Mill's defense of individual liberty against social and governmental coercion, including the harm principle and the case for free speech. Crisp, structured argumentation ideal for stance and reasoning tasks.

#philosophy #political-theory #liberty

Economics & Society · 1848

The Communist Manifesto

Karl Marx and Friedrich Engels

The concise political pamphlet outlining historical materialism and the class struggle. Short, rhetorically dense — a compact primary source for political-theory and persuasion corpora.

#political-theory #economics #history

Literature & Fiction · 1813

Pride and Prejudice

Jane Austen

Austen's comedy of manners following Elizabeth Bennet and Mr. Darcy. Famous for free indirect discourse and razor-sharp dialogue — a benchmark corpus for narrative voice, irony, and character modeling.

#fiction #romance #dialogue

Literature & Fiction · 1818

Frankenstein; or, The Modern Prometheus

Mary Wollstonecraft Shelley

Often regarded as the first work of science fiction: a nested, epistolary tale of creation, ambition, and abandonment. Multiple narrators and frames make it useful for perspective and discourse-structure modeling.

#fiction #science-fiction #gothic

Literature & Fiction · 1851

Moby-Dick; or, The Whale

Herman Melville

Ishmael's voyage aboard the Pequod under the obsessed Captain Ahab. Encyclopedic in register — narrative, technical cetology, sermon, soliloquy — a stress-test corpus for long-context retrieval and genre-shift.

#fiction #adventure #symbolism

Literature & Fiction · 1892

The Adventures of Sherlock Holmes

Arthur Conan Doyle

Twelve self-contained detective stories featuring Holmes and Watson. Clean problem→deduction→resolution structure across discrete episodes — excellent for reasoning, QA, and chunked-retrieval evaluation.

#fiction #mystery #deduction

Literature & Fiction · 1897

Dracula

Bram Stoker

The epistolary horror novel assembled from journals, letters, and telegrams. Its multi-document construction makes it a natural fit for source-attribution and timeline-reconstruction tasks.

#fiction #gothic #horror

Literature & Fiction · 1890

The Picture of Dorian Gray

Oscar Wilde

Wilde's only novel: a study of aestheticism, vanity, and moral decay, dense with epigram. A rich source of quotable, stylistically distinctive English prose.

#fiction #philosophical #epigram

Literature & Fiction · 1865

Alice's Adventures in Wonderland

Lewis Carroll

Carroll's logical-nonsense classic. Short, surreal, and full of wordplay and rule-bending dialogue — a compact corpus for language-play, humor, and edge-case parsing.

#fiction #fantasy #wordplay

Literature & Fiction · 1859

A Tale of Two Cities

Charles Dickens

Dickens's novel of London and Paris during the French Revolution. Sweeping historical narrative with one of the most recognizable openings in English — strong for narrative-arc and sentiment modeling.

#fiction #historical #revolution