Reference

Methodology

How Rodin reads, measures, and matches

Rodin is built on a simple claim: the way a person structures thought leaves a measurable signature in their writing, and that signature is more honest than a résumé. This page documents, in full, how the signature is extracted and how two minds are judged close.


Part One

The Process

  1. Paste your writing

    Upload an Obsidian vault, paste a Notion export, or drop in anything you've written — notes, essays, journals, threads. A few thousand words is a good start. More is better.

  2. The model reads how you think

    A single pass across your corpus. Not topic extraction — pattern extraction. What you keep returning to, what you circle without resolving, whose frameworks you inherited, where your attention refuses to go.

  3. Your fingerprint arrives

    Eight structured layers: archetype, one-liner, recurring themes, open questions, mental models, intellectual DNA, blind spots, and the single core question driving the whole body of work.

  4. A public profile goes live

    Add your name and a handle. Your page lives at rodin.fyi/p/[id] — shareable as a link, downloadable as a card, legible at a glance.

  5. The map finds its peers

    Your cognitive vector is compared against every other profile. The closest minds surface on your page. The connection starts there.

Part Two

The Method

What Rodin measures

Every profile carries two layers. Both are derived from the same text you paste.

  • The fingerprint. Eight structured fields — themes, open questions, mental models, intellectual DNA, blind spots, core question, one-liner, archetype. Generated by a large language model against a fixed, versioned prompt.
  • The cognitive topology. A 12-dimensional vector describing how — not what — the writer reasons: epistemic confidence, epistemic diversity, temporal orientation, argument density, conceptual leap, authority reference, first-principles reasoning, experiential reference, evidential reference, dialectical complexity, abstraction level, and intellectual tempo.

The fingerprint is what you read. The topology is what the matcher reads.

How extraction works

Your writing is streamed to a large language model as a single request with a system prompt that instructs the model to return each fingerprint field as a discrete NDJSON event. The client receives these events in order — themes first, then questions, then models, then DNA, and so on — and renders them as they arrive. This is not a UI trick. It is the actual response shape.
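The exact event protocol is not published. As an illustrative sketch, assuming each NDJSON line carries a `field` name and a `data` payload (both names are assumptions, not Rodin's documented schema), a client-side consumer might look like:

```python
import json

def consume_fingerprint_stream(lines):
    """Parse NDJSON events as they arrive, yielding (field, payload) pairs
    in the order the server emitted them."""
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # tolerate keep-alive blank lines
        event = json.loads(raw)
        yield event["field"], event["data"]

# A stand-in stream, in the order described above: themes before questions.
stream = [
    '{"field": "themes", "data": ["attention", "memory"]}',
    '{"field": "open_questions", "data": ["What persists?"]}',
]
for field, payload in consume_fingerprint_stream(stream):
    print(field, payload)
```

Because each field is a complete JSON document on its own line, the client can render a field the moment its line arrives, without waiting for the full response.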

After the eight fields resolve, the same text goes through a second pass — the CTA engine — which computes the 12-dimensional cognitive vector deterministically from a set of internal features. The vector is not generated by any language model. It is computed. This is the load-bearing distinction.
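The actual features and weights are private (see "What we do not disclose"). Purely to illustrate what "computed, not generated" means, here is a toy engine with two stand-in features, a hedging-word rate and sentence-length variation, chosen for this sketch and not taken from Rodin. The same text always yields the same vector:

```python
import re
import statistics

# Stand-in hedge lexicon for illustration only.
HEDGES = {"maybe", "perhaps", "might", "possibly", "seems"}

def toy_topology(text):
    """Deterministic feature extraction: no model calls, no sampling."""
    words = re.findall(r"[a-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    hedge_rate = sum(w in HEDGES for w in words) / max(len(words), 1)
    lengths = [len(s.split()) for s in sentences]
    tempo = statistics.pstdev(lengths) if len(lengths) > 1 else 0.0
    return {
        "epistemic_confidence": 1.0 - hedge_rate,  # fewer hedges, firmer claims
        "intellectual_tempo": tempo,               # sentence-length variation
    }
```

Running `toy_topology` twice on the same text returns the identical dictionary, which is the property that makes the vector auditable in a way sampled model output is not.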

Your source writing is never stored. Only the derived fingerprint and the computed vector are retained.

How matching works

Two profiles' cognitive vectors are compared by cosine distance in 12-dimensional space. The /api/similar endpoint returns the nearest neighbors by this distance, and a pair is flagged as a mutual match when each profile appears in the other's top-k.

Cosine was chosen deliberately over Euclidean distance. Euclidean over-weights magnitude — two writers with the same shape of mind but different verbosity would look different. Cosine reads shape, not volume.
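The distinction can be seen directly: scaling a vector changes its Euclidean distance but not its cosine distance. A minimal sketch, using only the standard library:

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity: 0 for identical direction, up to 2 for opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)

# Same shape of mind, different "volume": cosine reads them as identical.
quiet = [0.2, 0.4, 0.1]
loud  = [0.4, 0.8, 0.2]   # the same profile, scaled by 2

print(cosine_distance(quiet, loud))  # ~0.0: same direction
print(math.dist(quiet, loud))        # Euclidean: clearly nonzero
```

Equivalently, cosine distance is Euclidean distance after normalizing every vector to unit length, which is exactly the "shape, not volume" property described above.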

We do not match on shared themes, shared DNA, or surface keyword overlap. Those are often noisier than the underlying geometry.

How graph roles are computed

Every profile sits inside a similarity graph built across the full population. For each profile we compute an adaptive cosine threshold — calibrated so the expected degree scales with log₂(N) × 2 — and then run two structural measures:

  • Betweenness centrality (Brandes' algorithm, O(V·E)). Measures how often a node lies on shortest paths between other nodes. High betweenness means a profile bridges otherwise-disconnected clusters.
  • Clustering coefficient. The local density of a profile's neighborhood — how many of its neighbors are also neighbors of each other. High clustering means the profile sits in a tight, mutually-reinforcing group.
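A minimal sketch of the second measure, assuming the similarity graph is held as adjacency sets (an illustrative representation, not Rodin's internal one):

```python
def clustering_coefficient(adj, node):
    """Fraction of a node's neighbor pairs that are themselves connected."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0  # fewer than two neighbors: no pairs to check
    links = sum(1 for u in nbrs for v in nbrs if u < v and v in adj[u])
    return links / (k * (k - 1) / 2)

# A triangle (a, b, c) plus one pendant node d hanging off a:
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
print(clustering_coefficient(adj, "a"))  # 1 of 3 neighbor pairs linked
```

Node "a" scores 1/3 because only one of its three neighbor pairs (b, c) is connected; the pendant node "d" scores 0, the signature of a Satellite-like position.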

Roles are assigned by empirical percentiles, not arbitrary cutoffs:

  • Nexus — high betweenness, low clustering. A bridge between communities.
  • Core — high clustering, central position. Reinforces and extends a dense cluster.
  • Territorial — dense local neighborhood, low reach. Deep in one community.
  • Satellite — low degree, sparse neighborhood. On the edge of the map.
  • Isolate — degree ≤ 1. Structurally alone.
  • Generalist — unremarkable on every axis. Most profiles.

These are computed nightly and cached for six hours.

current corpus · UMAP projection

How rarity is computed

For each of the 12 cognitive dimensions, a profile's value is ranked against the full population to produce a two-tailed empirical p-value. The negative log of that p-value — in nats — is the dimension's surprisal: how unexpected your position is along that axis.

The 12 surprisals are averaged to produce a joint surprisal. We deliberately do not sum them, as Fisher's method would, because the dimensions are correlated. Summing would inflate the appearance of rarity. Averaging preserves honesty.

P-values are floored at 1/(N+1) so that a single outlier cannot claim infinite rarity. The percentile of your joint surprisal against the population is the final reported figure.

The approach is rank-based and nonparametric. We do not assume any dimension is Gaussian. Most aren't.
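One possible reading of this procedure, with rank-based two-tailed p-values, the 1/(N+1) floor, and averaging rather than summing, can be sketched as follows (an illustration of the description above, not Rodin's internal code):

```python
import math

def dim_pvalue(x, xs):
    """Two-tailed empirical p-value of x against population values xs."""
    n = len(xs)
    lo = sum(v <= x for v in xs) / n   # mass at or below x
    hi = sum(v >= x for v in xs) / n   # mass at or above x
    p = 2 * min(lo, hi)                # two-tailed: distance from either edge
    return min(max(p, 1 / (n + 1)), 1.0)  # floor stops infinite rarity

def joint_surprisal(profile, population):
    """Mean (not summed) surprisal across dimensions, in nats."""
    dims = len(profile)
    cols = [[p[d] for p in population] for d in range(dims)]
    return sum(
        -math.log(dim_pvalue(profile[d], cols[d])) for d in range(dims)
    ) / dims
```

With a population of N values, an outlier beyond every observed value gets p = 1/(N+1) and therefore surprisal ln(N+1), a finite ceiling, while a value sitting at the median gets p = 1 and surprisal 0.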

How drift is measured

When a profile has multiple fingerprint snapshots across time, we compute a geometric summary of how its mind has moved:

  • Drift vector — last position minus first, in normalized space.
  • Magnitude — the Euclidean length of that vector. How far you have net-moved.
  • Principal axis — the single dimension that shifted most, and its sign.
  • Path length — the sum of all step magnitudes. Total motion, not just net displacement.
  • Linearity — magnitude divided by path length. 1.0 is a straight line; low values indicate wandering that returned near the starting point.
  • Direction stability — mean cosine between consecutive step vectors. 1.0 is steady drift, 0 is random walk, negative values are oscillation.

Magnitude alone is misleading. A profile can drift very little net but have traveled far — that's exploratory motion, not settled change. The distinction matters when reading the timeline.
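The definitions above can be collected into one function over a sequence of vector snapshots (an illustrative implementation, not Rodin's internal code):

```python
import math

def drift_summary(snapshots):
    """Geometric summary of how a vector moved across ordered snapshots."""
    steps = [
        [b - a for a, b in zip(s0, s1)]
        for s0, s1 in zip(snapshots, snapshots[1:])
    ]
    net = [b - a for a, b in zip(snapshots[0], snapshots[-1])]
    magnitude = math.sqrt(sum(x * x for x in net))           # net displacement
    path = sum(math.sqrt(sum(x * x for x in s)) for s in steps)  # total motion
    axis = max(range(len(net)), key=lambda d: abs(net[d]))
    cosines = [  # alignment of consecutive steps
        sum(x * y for x, y in zip(u, v))
        / (math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(x * x for x in v)))
        for u, v in zip(steps, steps[1:])
    ]
    return {
        "magnitude": magnitude,
        "path_length": path,
        "linearity": magnitude / path if path else 1.0,
        "principal_axis": (axis, 1 if net[axis] >= 0 else -1),
        "direction_stability": sum(cosines) / len(cosines) if cosines else 1.0,
    }
```

Two equal-length paths can tell opposite stories: steady drift in one direction gives linearity and stability near 1.0, while an out-and-back oscillation gives linearity 0 and negative stability, which is exactly the exploratory-versus-settled distinction described above.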

The twelve dimensions

The cognitive vector's twelve axes, with a one-line definition for each. The coefficients that compute them are not public; the definitions are.

  1. Epistemic confidence

    assertion vs hedging — how firmly claims are stated

  2. Epistemic diversity

    range of stances entertained within one piece of writing

  3. Temporal orientation

    past-looking vs future-looking attention

  4. Argument density

    claims per sentence, controlling for length

  5. Conceptual leap

    average semantic distance between adjacent ideas

  6. Authority reference

    frequency of appeals to named thinkers or canonical texts

  7. First-principles reasoning

    building from axioms rather than precedent

  8. Experiential reference

    appeals to personal experience as evidence

  9. Evidential reference

    appeals to data, studies, or empirical record

  10. Dialectical complexity

    thesis/antithesis/synthesis patterns

  11. Abstraction level

    abstract vs concrete language ratio

  12. Intellectual tempo

    sentence-length variation as rhythm

Roadmap

The current 12-dimensional space is computed from interpretable, hand-specified features. The next step — in progress — is a contrastive learned projection: a small encoder trained on human-labeled pairs of "these two minds actually fit" versus "these don't." The learned embedding will sit alongside the interpretable one, and matching will blend the two.

The interpretable vector isn't going away. It is what makes the profile page legible. The learned vector is what will make the matches sharper.

What we do not disclose

Rodin publishes its methodology — but not every coefficient. Specifically, the feature weights inside the topology engine, the exact formulas behind the 12 dimensions, and the internal calibration constants are kept private. The moat is not the math; the math is here. The moat is the particular choice of features, the training data, and the thousands of small decisions that produced an engine that feels right rather than arbitrary.

If you would like to audit a specific result — your profile role, your rarity score, your drift — email hello@rodin.fyi and we will walk you through your numbers.

Part Three

Questions

What writing should I paste?
Anything you've written yourself: Obsidian vaults, Notion exports, blog posts, essays, journal entries, or long threads. At least a few thousand words gives the model enough signal to find real patterns rather than surface keywords.
How accurate is the fingerprint?
Accuracy depends on the volume and variety of your writing. Most people find the themes and questions immediately recognizable. The blind spots section tends to be the most surprising. It is an interpretation, not a clinical assessment.
Is my writing stored?
No. Your source text is discarded after the fingerprint is computed. Only the derived fingerprint and the computed 12-dimensional vector are retained.
Can I delete my profile?
Email privacy@rodin.fyi with your profile URL. We will remove it within 30 days.
What does "similar thinkers" mean?
We compute a 12-dimensional cognitive vector for every profile and compare by cosine distance. The closest vectors are shown on your page — ordered by geometry, not shared keywords.