Reference
Methodology
How Rodin reads, measures, and matches
Rodin rests on three claims.
The first is essence.
How you structure thought leaves a signature in your writing that persists when topic, vocabulary, and register all change. We read that signature — not what you write about.
The second is discovery.
The relationship between two minds — whether they think alike or productively contend — exists in cognitive space before anyone looks for it. Rodin doesn’t build the connection. It reveals one that was already there.
The third is stability.
A fingerprint is real to the degree it holds its shape across samples. If the same pattern emerges from three different pieces of your writing, the pattern is yours, not the text’s. The rest of this page documents how the signature is extracted and how two minds are judged close.
Part One
The Process
Paste your writing
Upload an Obsidian vault, paste a Notion export, or drop in anything you've written — notes, essays, journals, threads. A few thousand words is a good start. More is better.
The model reads how you think
A single pass across your corpus. Not topic extraction — pattern extraction. What you keep returning to, what you circle without resolving, whose frameworks you inherited, where your attention refuses to go.
Your fingerprint arrives
Eight structured layers: archetype, one-liner, recurring themes, open questions, mental models, intellectual DNA, blind spots, and the single core question driving the whole body of work.
A public profile goes live
Add your name and a handle. Your page lives at rodin.fyi/p/[id] — shareable as a link, downloadable as a card, legible at a glance.
The map finds its peers
Your cognitive vector is compared against every other profile. The closest minds surface on your page. The connection starts there.
Part Two
The Method
What Rodin measures
Every profile carries two layers. Both are derived from the same text you paste.
- The fingerprint. Eight structured fields — themes, open questions, mental models, intellectual DNA, blind spots, core question, one-liner, archetype. Generated by a large language model against a fixed, versioned prompt.
- The cognitive topology. A 12-dimensional vector describing how, not what, the writer reasons: epistemic confidence, epistemic diversity, temporal orientation, argument density, conceptual leap, authority reference, first-principles reasoning, experiential reference, evidential reference, dialectical complexity, abstraction level, and intellectual tempo.
The fingerprint is what you read. The topology is what the matcher reads.
How extraction works
Your writing is streamed to a large language model as a single request with a system prompt that instructs the model to return each fingerprint field as a discrete NDJSON event. The client receives these events in order — themes first, then questions, then models, then DNA, and so on — and renders them as they arrive. This is not a UI trick. It is the actual response shape.
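As a rough illustration of that response shape, here is a minimal sketch of a client-side NDJSON reader. The event keys `field` and `value` and the sample payloads are assumptions made for the example; the actual event schema is not published on this page.

```python
import json
from typing import Iterable, Iterator


def iter_ndjson_events(chunks: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed JSON object per complete line, as chunks arrive."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        # A chunk boundary can fall mid-line, so only complete lines are parsed.
        while "\n" in buffer:
            line, buffer = buffer.split("\n", 1)
            if line.strip():
                yield json.loads(line)
    if buffer.strip():
        # Final event that arrived without a trailing newline.
        yield json.loads(buffer)


# Hypothetical stream: the keys "field" and "value" are assumptions.
stream = [
    '{"field": "themes", "value": ["attention"]}\n{"field": "ques',
    'tions", "value": ["what persists?"]}\n',
    '{"field": "models", "value": ["feedback loops"]}',
]
events = list(iter_ndjson_events(stream))
order = [event["field"] for event in events]
```

Because each event is a complete JSON object on its own line, a field can be rendered as soon as its line closes, without waiting for the rest of the response.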
After the eight fields resolve, the same text goes through a second pass, the CTA engine, which computes the 12-dimensional cognitive vector deterministically from a set of internal features. The vector is not generated by any language model; it is computed. This is the load-bearing distinction.
Your source writing is never stored. Only the derived fingerprint and the computed vector are retained.
How matching works
Two profiles' cognitive vectors are compared by cosine distance in 12-dimensional space. The /api/similar endpoint returns the nearest neighbors by this distance, and a mutual-match threshold is applied when both profiles appear in each other's top-k.
Cosine was chosen deliberately over Euclidean distance. Euclidean over-weights magnitude — two writers with the same shape of mind but different verbosity would look different. Cosine reads shape, not volume.
We do not match on shared themes, shared DNA, or surface keyword overlap. Those are often noisier than the underlying geometry.
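The cosine comparison and the mutual top-k rule can be sketched as follows. This is an illustration under assumptions, not the production matcher: the vectors are shortened to three dimensions for readability, and the profile names, values, and function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cosine_distance(a, b):
    return 1.0 - cosine_similarity(a, b)


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(profiles, key, k):
    """Names of the k profiles nearest to `key` by cosine distance."""
    others = [name for name in profiles if name != key]
    others.sort(key=lambda name: cosine_distance(profiles[key], profiles[name]))
    return others[:k]


def is_mutual(profiles, a, b, k):
    """True when each profile appears in the other's top-k."""
    return b in top_k(profiles, a, k) and a in top_k(profiles, b, k)


# Hypothetical profiles; "a_loud" has the same shape as "a" at twice the magnitude.
profiles = {
    "a": [1.0, 2.0, 3.0],
    "a_loud": [2.0, 4.0, 6.0],
    "b": [3.0, 1.0, 0.5],
    "c": [3.0, 1.0, 1.0],
}
same_shape_cosine = cosine_distance(profiles["a"], profiles["a_loud"])
same_shape_euclid = euclidean_distance(profiles["a"], profiles["a_loud"])
```

Here `same_shape_cosine` is zero up to floating-point error, while `same_shape_euclid` is about 3.74: the two vectors point the same way and differ only in length, which is the case cosine was chosen to ignore.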
How prose distance is measured
Cognitive geometry reads how a writer reasons. Stylometry reads how a writer’s sentences move — a different question that picks up on a different layer of authorial signature. Rodin computes both, separately, and surfaces stylometric kin alongside cognitive ones.
For each piece of writing we extract a 50-dimensional function-word profile: the relative frequency, per token, of the fifty most common English function words — the, of, and, that, and so on. Function words are the load-bearing instrument of stylometry because their frequency is largely independent of topic. A philosopher writing about ethics and the same philosopher writing about aesthetics will use different content vocabulary, but the rate at which they reach for which over that, or moreover over but, stays remarkably consistent.
Distance between two profiles is computed by Burrows’ Delta: the mean absolute z-score difference between the two function-word vectors, standardized against the canon corpus mean and standard deviation. Lower delta means more stylistically similar prose. Same-author writing typically falls below 0.55 (we call this “kindred”); same-period contemporaries cluster in 0.55–0.85 (“close”); writers from different traditions or centuries spread out above 1.25 (“distant”).
The reference statistics are computed once over the historical canon and baked into the codebase. The 8-dim normalized stylometric vector that appears on profile pages is a separate, interpretable surface — rhythm, register, syntax, density — and is not what the matching engine reads. The matching engine reads the 50-dim profile.
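A minimal sketch of the function-word profile and the delta computation described above. The six-word list stands in for the fifty function words, and the tokenizer and function names are assumptions for illustration; the canon reference mean and standard deviation are passed in as plain lists.

```python
import re

# Stand-in for the fifty function words (assumption: shortened for the example).
FUNCTION_WORDS = ["the", "of", "and", "that", "which", "but"]


def function_word_profile(text):
    """Relative frequency, per token, of each function word."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        raise ValueError("text contains no tokens")
    return [tokens.count(word) / len(tokens) for word in FUNCTION_WORDS]


def burrows_delta(profile_a, profile_b, ref_mean, ref_std):
    """Mean absolute difference between the two z-scored profiles."""
    z_a = [(x - m) / s for x, m, s in zip(profile_a, ref_mean, ref_std)]
    z_b = [(x - m) / s for x, m, s in zip(profile_b, ref_mean, ref_std)]
    return sum(abs(p - q) for p, q in zip(z_a, z_b)) / len(z_a)


def band(delta):
    """Band names quoted above; no band is named between 0.85 and 1.25."""
    if delta < 0.55:
        return "kindred"
    if delta <= 0.85:
        return "close"
    if delta > 1.25:
        return "distant"
    return None


profile = function_word_profile("The cat and the dog")
# Hypothetical two-word profiles and reference statistics.
delta = burrows_delta([0.06, 0.03], [0.05, 0.03], [0.055, 0.03], [0.01, 0.01])
```

The reference mean cancels when two z-scores are subtracted, so in this form the delta depends only on the per-word differences scaled by the reference standard deviation.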
The substrate this fingerprint reads
The twelve dimensions Rodin measures are real but parochial. First-principles reasoning — dimension 07 — is computed against a lexicon of markers like by definition, axiomatically, from first principles, deductively, tautologically. Experiential reference — dimension 08 — keys on I noticed, I observed, personally, lived experience: first-person individual experience, not communal. Abstraction level — dimension 11 — leans on ontology, epistemology, phenomenology, teleological, axiom: the vocabulary of one philosophical tradition. Authority reference — dimension 06 — keys on according to, as X argues, in the tradition of: Western academic citation patterns. These markers are not abstract universal cognitive features. They are the legible cognitive shape of English-language analytic prose.
This is structural, not statistical. A reader can verify it without rerunning anything: the lexicons live in the source code, and the source code is not hiding what it knows. The shape we measure is the shape that one tradition’s prose has trained itself to perform legibly.
We tested what this means in practice. Six excerpts from the Stanford Encyclopedia of Philosophy — three on African philosophy (the Akan account of personhood, the philosophic-sage tradition, character-based African ethics), three on Cartesian metaphysics (Descartes’ epistemology, his modal philosophy, Cartesian dualism) — were read by the same engine that reads visitor profiles. The narrative labels split cleanly along substrate. All three Cartesian essays produced “First-principles builder.” All three African essays produced “Authority-referencing.” But the African texts are not less foundational; they reason from communal practice, from proverb-and-lineage citation, from the constitutive entanglement of self and ancestor. The engine cannot name that mode of reasoning, so it routes the reading through the closest Cartesian-shaped vocabulary item available — flattening a substrate-distinctive pattern into the label nearest at hand.
The numeric apparatus on this test is illustrative, not evidential. The 12-dim engine is noisy at sub-2,000-word lengths; splitting any one of these essays in half produces enough divergence to discount per-essay vector readings. The narrative-label finding survives because it is qualitative — a reader can verify the misnaming by reading the output. The stylometric layer described above does not have the same parochialism (function-word distributions cluster a single author’s prose tightly across topics), but it also does not name what the writer is reasoning about. Each layer reads what it reads.
A separate causal check, the identifiability atlas, takes each of the twelve dimensions in turn, removes the lexicon that drives it from a thirty-essay corpus, and reports how far the score, the geometry, and the public labels move. Strong rows tell you a dimension genuinely measures what its name claims; weak rows tell you the lexicon barely fires on prose of that register. It is the auditable companion to the structural argument here.
So we name the boundary plainly: the v1 fingerprint matches you most accurately when your writing operates within English-analytic register. If your generative substrate is Carl Mika’s worldedness — the entanglement of person, ancestor, place, and language as constitutive of thought — or Edwin Etieyibo’s ubuntu, the self as constitutively communal, or the Datong and Tai-ping utopian traditions, or the Australian Aboriginal Dreaming as a temporal ontology, the v1 fingerprint will read part of you and miss part of you. We say which part.
In Korean, 안돼 carries a moral weight that English no does not carry. If your inner monologue runs partly in Korean, the English fingerprint gives you a measurement of how you think in English. That is one true answer to a more complicated question.
V2 will surface a multi-substrate axis. V1 names the boundary instead of pretending it isn’t there.
How citation works
Geometric matching is the first kind of edge between profiles. Citation is the second. Where matching is inferred (computed from the writing without anyone's consent or knowledge), citation is declared. A profile owner picks another profile, optionally adds a note of up to 280 characters, and the edge becomes part of the public graph.
Both edges live on every profile page, and they answer different questions. The match strip answers whose mind looks like this one in cognitive space. The citers strip answers which minds publicly carry this one's work forward, and the reading list answers which minds compose this one's intellectual lineage. Geometry can be wrong about kinship; citation cannot, because the citer is putting their name on it.
The schema is minimal: a directed edge from citer to cited, with an optional note, a timestamp, and a unique constraint preventing duplicate (citer, cited) pairs. There is no "follow," no notification, no count divorced from identity. The follower count is a metric the cited person learns nothing from; the citers strip is a list of named minds, each with their own archetype and cognitive signature visible.
Mutual citation — when A cites B and B cites A independently, with no exchange or prompt — surfaces as a "◈ Mutual citation" badge on both profiles. It is one of the few signals on the open web that resists gaming, because the cost of a counterfeit citation is a public reputational claim made under the citer's own name and shape of mind.
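The edge schema and the mutual-citation check can be sketched with an in-memory SQLite table. The table name, column names, and helper functions are assumptions for illustration; only the shape (directed edge, optional note, timestamp, unique pair) comes from the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE citations (
        citer      TEXT NOT NULL,
        cited      TEXT NOT NULL,
        note       TEXT CHECK (note IS NULL OR length(note) <= 280),
        created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
        UNIQUE (citer, cited)
    )
    """
)


def cite(citer, cited, note=None):
    """Insert a directed edge; return False if a constraint rejects it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO citations (citer, cited, note) VALUES (?, ?, ?)",
                (citer, cited, note),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def is_mutual_citation(a, b):
    """True when both directed edges a->b and b->a exist."""
    row = conn.execute(
        "SELECT COUNT(*) FROM citations "
        "WHERE (citer = ? AND cited = ?) OR (citer = ? AND cited = ?)",
        (a, b, b, a),
    ).fetchone()
    return row[0] == 2


# Hypothetical profile handles.
first = cite("ada", "grace", "carries the compiler lineage forward")
duplicate = cite("ada", "grace")
before = is_mutual_citation("ada", "grace")
cite("grace", "ada")
after = is_mutual_citation("ada", "grace")
```

The unique constraint does the work of preventing duplicate pairs, so the application code only has to handle the rejected insert.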
Citation is what makes the graph addressable to people who never created a Rodin profile in the first place. A historical canon entry can be cited; an unclaimed seed profile can be cited; the cited side does not need to be online or even alive. The act is the citer's, and the value flows in both directions: the cited gains durable public attribution, and the citer publishes a piece of their reading life.
How graph roles are computed
Every profile sits inside a similarity graph built across the full population. For each profile we compute an adaptive cosine threshold — calibrated so the expected degree scales with log₂(N) × 2 — and then run two structural measures:
- Betweenness centrality (Brandes' algorithm, O(V·E)). Measures how often a node lies on shortest paths between other nodes. High betweenness means a profile bridges otherwise-disconnected clusters.
- Clustering coefficient. The local density of a profile's neighborhood — how many of its neighbors are also neighbors of each other. High clustering means the profile sits in a tight, mutually-reinforcing group.
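Both measures can be sketched on a small unweighted graph. This is a minimal illustration, assuming the thresholded similarity graph is already available as an adjacency map; the node names and the toy graph (two triangles joined through a single bridge node) are hypothetical, and the adaptive threshold itself is not shown.

```python
from collections import deque


def betweenness_centrality(adj):
    """Brandes' algorithm for an unweighted, undirected graph."""
    centrality = {v: 0.0 for v in adj}
    for source in adj:
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}   # number of shortest paths from source
        dist = {v: -1 for v in adj}
        sigma[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies in order of decreasing distance from source.
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                centrality[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: c / 2 for v, c in centrality.items()}


def clustering_coefficient(adj, node):
    """Fraction of a node's neighbor pairs that are themselves connected."""
    neighbors = list(adj[node])
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if neighbors[j] in adj[neighbors[i]]
    )
    return links / (k * (k - 1) / 2)


edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("e", "f"), ("e", "g"), ("f", "g")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

bc = betweenness_centrality(adj)
```

In this toy graph, node `d` has the highest betweenness and a clustering coefficient of zero, which is the profile of a bridge; node `a` has zero betweenness and a clustering coefficient of one, which is the profile of a node deep inside one cluster.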
Roles are assigned by empirical percentiles, not arbitrary cutoffs:
- Nexus — high betweenness, low clustering. A bridge between communities.
- Core — high clustering, central position. Reinforces and extends a dense cluster.
- Territorial — dense local neighborhood, low reach. Deep in one community.
- Satellite — low degree, sparse neighborhood. On the edge of the map.
- Isolate — degree ≤ 1. Structurally alone.
- Generalist — unremarkable on every axis. Most profiles.
These are computed nightly and cached for six hours.
How rarity is computed
For each of the 12 cognitive dimensions, a profile's value is ranked against the full population to produce a two-tailed empirical p-value. The negative log of that p-value — in nats — is the dimension's surprisal: how unexpected your position is along that axis.
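The 12 surprisals are averaged to produce a joint surprisal. We deliberately do not combine them with Fisher's method, which sums log p-values and reads the total against a chi-squared distribution that assumes independent tests. The dimensions are correlated, so that reading would inflate the appearance of rarity. Averaging preserves honesty.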
P-values are floored at 1/(N+1) so that a single outlier cannot claim infinite rarity. The percentile of your joint surprisal against the population is the final reported figure.
The approach is rank-based and nonparametric. We do not assume any dimension is Gaussian. Most aren't.
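The rarity steps can be sketched as follows. This is one plausible reading under stated assumptions: the two-tailed p-value is taken as twice the smaller tail fraction (capped at 1), ties count toward both tails, and the function names and the ten-value population are hypothetical.

```python
import math


def two_tailed_p(value, population):
    """Empirical two-tailed p-value by rank, floored at 1/(N+1)."""
    n = len(population)
    low = sum(1 for v in population if v <= value) / n
    high = sum(1 for v in population if v >= value) / n
    p = min(1.0, 2.0 * min(low, high))
    return max(p, 1.0 / (n + 1))


def surprisal(value, population):
    """Negative natural log of the p-value, in nats."""
    return -math.log(two_tailed_p(value, population))


def joint_surprisal(vector, columns):
    """Mean of the per-dimension surprisals (not their sum)."""
    per_dim = [surprisal(x, col) for x, col in zip(vector, columns)]
    return sum(per_dim) / len(per_dim)


# Hypothetical population values along one dimension.
population = list(range(1, 11))
typical = surprisal(5, population)      # mid-population value
edge = two_tailed_p(1, population)      # lowest observed value
outlier = two_tailed_p(100, population) # beyond every observed value
joint = joint_surprisal([5, 100], [population, population])
```

The floor shows up in the `outlier` case: the raw tail fraction is zero, so the p-value is clamped to 1/(N+1) and the surprisal stays finite at ln(N+1).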
How stability is measured
A fingerprint is provisional until it proves it can hold its shape across samples. When a profile has more than one corpus, we compute its stability as the mean pairwise cosine similarity across every pair of cognitive signature vectors (history + current). With N samples, that's N(N-1)/2 pairs averaged into a single number.
With one sample, stability is undefined — there are no pairs to compare. The profile displays as provisional. With two or more samples, we assign a band:
- Emerging — stability below 0.75. A pattern is starting to appear, but there's still meaningful drift between submissions.
- Stable — 0.75 to 0.9. The mean pairwise agreement is high. Submissions disagree on the margins but agree on the core.
- Confirmed — 0.9 and above. Submissions are essentially repeating the same pattern in different language. The signature is very likely real.
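The stability score and its bands can be sketched directly from the definitions above. The function names and the sample vectors are hypothetical; the thresholds are the ones listed.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def stability(samples):
    """Mean pairwise cosine similarity; None when there are no pairs."""
    if len(samples) < 2:
        return None
    pairs = list(combinations(samples, 2))  # N(N-1)/2 pairs
    return sum(cosine_similarity(a, b) for a, b in pairs) / len(pairs)


def stability_band(score):
    if score is None:
        return "provisional"
    if score < 0.75:
        return "emerging"
    if score < 0.9:
        return "stable"
    return "confirmed"


# Hypothetical two-dimensional signature vectors.
drifting = stability([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
steady = stability([[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]])
```

With three samples there are three pairs; in the `drifting` case one pair agrees fully and two are orthogonal, so the mean is one third and the band is Emerging.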
Stability is the mathematical shape of the stability-equals-reality claim the method rests on. It is not a quality score about the writing — it is a measurement of how repeatable the underlying signal is across different samples of that writing. A brilliant writer with one submitted sample is still provisional. A journeyman writer with five consistent submissions is confirmed.
How drift is measured
When a profile has multiple fingerprint snapshots across time, we compute a geometric summary of how its mind has moved:
- Drift vector — last position minus first, in normalized space.
- Magnitude — the Euclidean length of that vector. How far you have net-moved.
- Principal axis — the single dimension that shifted most, and its sign.
- Path length — the sum of all step magnitudes. Total motion, not just net displacement.
- Linearity — magnitude divided by path length. 1.0 is a straight line; low values indicate wandering that returned near the origin.
- Direction stability — mean cosine between consecutive step vectors. 1.0 is steady drift, 0 is random walk, negative values are oscillation.
Magnitude alone is misleading. A profile can drift very little net but have traveled far — that's exploratory motion, not settled change. The distinction matters when reading the timeline.
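The six drift quantities can be sketched from a list of snapshots. This assumes the snapshots are already in the normalized space mentioned above and ordered by time; the function names and the two-dimensional examples are hypothetical.

```python
import math


def subtract(a, b):
    return [x - y for x, y in zip(a, b)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(v):
    return math.sqrt(dot(v, v))


def drift_summary(snapshots):
    steps = [subtract(b, a) for a, b in zip(snapshots, snapshots[1:])]
    drift = subtract(snapshots[-1], snapshots[0])
    magnitude = norm(drift)
    path_length = sum(norm(step) for step in steps)
    # Index (0-based) of the dimension with the largest net shift.
    axis = max(range(len(drift)), key=lambda i: abs(drift[i]))
    # Cosine between consecutive steps, skipping zero-length steps.
    turns = [
        dot(s, t) / (norm(s) * norm(t))
        for s, t in zip(steps, steps[1:])
        if norm(s) > 0 and norm(t) > 0
    ]
    return {
        "drift": drift,
        "magnitude": magnitude,
        "principal_axis": (axis, 1 if drift[axis] >= 0 else -1),
        "path_length": path_length,
        "linearity": magnitude / path_length if path_length else None,
        "direction_stability": sum(turns) / len(turns) if turns else None,
    }


corner = drift_summary([[0.0, 0.0], [3.0, 0.0], [3.0, 4.0]])
oscillating = drift_summary([[0.0, 0.0], [1.0, 0.0], [0.0, 0.0], [1.0, 0.0]])
```

The `oscillating` case is the one the paragraph above warns about: net magnitude is 1, path length is 3, and direction stability is -1, so the profile moved a lot while ending close to where it began.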
The twelve dimensions
The cognitive vector's twelve axes, with a one-line definition for each. The coefficients that compute them are not public; the definitions are.
- 01 Epistemic confidence: assertion vs hedging, how firmly claims are stated
- 02 Epistemic diversity: range of stances entertained within one piece of writing
- 03 Temporal orientation: past-looking vs future-looking attention
- 04 Argument density: claims per sentence, controlling for length
- 05 Conceptual leap: average semantic distance between adjacent ideas
- 06 Authority reference: frequency of appeals to named thinkers or canonical texts
- 07 First-principles reasoning: building from axioms rather than precedent
- 08 Experiential reference: appeals to personal experience as evidence
- 09 Evidential reference: appeals to data, studies, or empirical record
- 10 Dialectical complexity: thesis/antithesis/synthesis patterns
- 11 Abstraction level: abstract vs concrete language ratio
- 12 Intellectual tempo: sentence-length variation as rhythm
Validation against canon pairs
A method this opinionated has to be falsifiable. Rodin maintains a fixture of 27 canonical pairs, documented intellectual relationships drawn from the textual record, and runs each one through the live matching geometry every 24 hours. For every pair (A, B), we ask: where does B rank among the rest of the archive when scored against A? A random metric would place each partner at a median rank near the midpoint. A working metric should place them well below.
Three independent metrics are evaluated — the production cosine ranker, the Jaccard + topology blend, and Burrows’ Delta on the function-word profile — alongside negative controls (unrelated thinkers from different eras and disciplines) which should sit near the random baseline rather than below it.
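The per-pair question ("where does B rank when scored against A?") can be sketched as below. The archive, its two-dimensional vectors, and the pair list are hypothetical stand-ins, and cosine distance is used as the example metric; any of the three metrics could be passed in the same way.

```python
import math
from statistics import median


def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def partner_rank(archive, a, b, distance):
    """1-based rank of b among all other entries, nearest to a first."""
    others = [name for name in archive if name != a]
    others.sort(key=lambda name: distance(archive[a], archive[name]))
    return others.index(b) + 1


# Hypothetical archive of five entries.
archive = {
    "a": [1.0, 0.0],
    "b": [0.9, 0.1],
    "c": [0.0, 1.0],
    "d": [-1.0, 0.0],
    "e": [0.5, 0.5],
}
pairs = [("a", "b"), ("c", "e")]
ranks = [partner_rank(archive, x, y, cosine_distance) for x, y in pairs]
median_rank = median(ranks)
midpoint = (len(archive) - 1 + 1) / 2  # middle rank among the other four entries
```

A metric that carries signal should give a `median_rank` well under `midpoint`; a negative-control pair such as (`a`, `d`) in this toy archive lands at the bottom of the ranking instead.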
The full table, with per-pair ranks across all three metrics, lives at /methodology/validation. It is the public version of the test we use internally to know whether a change to the matcher made it sharper or duller. A second public surface, the identifiability atlas, shows, dimension by dimension, what each axis actually measures by intervening on the lexicon that drives it. A third, the invariance study, runs each of the twelve dimensions through a test-retest comparison across an author’s separate works, on a corpus deliberately chosen to span genre, and reports honestly which dimensions clear a 95% bootstrap floor and which don’t. One does. The asymmetry is the result.
Roadmap
Two learned components now sit alongside the hand-specified geometry. A contrastive projection, trained on human-labeled pairs of “these two minds actually fit” versus “these don’t,” is in production behind the matching engine; its weights are shipped as a static asset and run as TypeScript inference, not a Python service. Burrows’ Delta on function-word profiles surfaces stylometric kin as a separate band of relatedness from the cognitive one.
What’s next: extending the canonical-pair fixture to several hundred labeled pairs so the validation harness becomes a regression test rather than a sanity check; surfacing stylometric kin on the map alongside cognitive matches; and a self-position marker on the population map so a writer can see, at a glance, where their geometry actually sits.
The interpretable vector isn’t going away. It is what makes the profile page legible. The learned components are what make the matches sharper.
What we do not disclose
Rodin publishes its methodology, but not every coefficient. Specifically, the feature weights inside the topology engine, the exact formulas that compute the 12 dimensions (as distinct from their one-line definitions, which are public), and the internal calibration constants are kept private. The moat is not the math; the math is here. The moat is the particular choice of features, the training data, and the thousands of small decisions that produced an engine that feels right rather than arbitrary.
If you would like to audit a specific result — your profile role, your rarity score, your drift — email support@rodin.fyi and we will walk you through your numbers.
A note on responsibility
The stability loop has a philosophical edge we should name. Nothing on your profile is assigned to you. The archetype, the questions, the blind spots, the vector — all of it is read from writing you chose to produce. If the reading feels wrong, you can refuse it, but you cannot blame it on the tool. The tool only reports what the corpus says.
And the corpus is revisable. Submit again next month, next year, after a period you’ve actually changed through, and the new reading will reflect that change or it won’t — which is itself useful information. Stability is not a verdict. It is a measurement of how repeatable your current pattern is, right now, in the writing you have so far produced. Your next thousand words can move the line.
This is the opposite of a personality test. A personality test tells you what you are; Rodin reports what you have written. The responsibility for the pattern stays with the writer — which is also where the freedom to change it lives.
Part Three
Questions
- What writing should I paste?
- Anything you've written yourself: Obsidian vaults, Notion exports, blog posts, essays, journal entries, or long threads. At least a few thousand words gives the model enough signal to find real patterns rather than surface keywords.
- How accurate is the fingerprint?
- Accuracy depends on the volume and variety of your writing. Most people find the themes and questions immediately recognizable. The blind spots section tends to be the most surprising. It is an interpretation, not a clinical assessment.
- Is my writing stored?
- No. Your source text is discarded after the fingerprint is computed. Only the derived fingerprint and the computed 12-dimensional vector are retained.
- Can I delete my profile?
- Email support@rodin.fyi with your profile URL. We will remove it within 30 days.
- What does "similar thinkers" mean?
- We compute a 12-dimensional cognitive vector for every profile and compare by cosine distance. The closest vectors are shown on your page — ordered by geometry, not shared keywords. Proximity in this space means intellectual relatedness, which spans two axes: aligned peers (minds that think like yours) and worthy counterparts (minds that productively contend with yours). Both are real kinds of kinship, and the geometry recovers both.
- What does it mean to cite another profile?
- A citation is a public, directed edge from your profile to another, with an optional note of up to 280 characters explaining why that mind matters to your reading. The cited profile gains durable, named attribution; you build a public reading list, the lineage you carry forward, that itself becomes part of your fingerprint. Citation needs no owner present on the cited side: historical canon entries and unclaimed seeds can be cited as readily as live profiles. When two profiles cite each other independently, both surface a "◈ Mutual citation" badge. It is one of the few signals on the open web that resists gaming, since the cost of a counterfeit citation is a public claim made under your own name.