Francesco Di Costanzo

The Real Productivity Gain Is Not Retrieval, It Is Recombination


There is a standard pitch for personal knowledge management tools that goes roughly as follows: build a second brain, link your notes, ask the AI, get smarter. The pitch is not wrong. It is aimed at the wrong bottleneck.

Most productivity writing focuses on retrieval — finding the document, surfacing the insight, reducing the time between having a question and holding an answer. Retrieval is genuinely expensive. The Microsoft Work Trend Index from 2025, analyzing 31,000 workers across 31 markets, found that employees were interrupted every two minutes, roughly 275 times a day. McKinsey Global Institute reported in 2012 that interaction workers spend approximately 19% of their time simply searching for information they need to complete tasks. The EY Work Reimagined Survey in 2025 found that 88% of employees were using AI at work, concentrated almost entirely on basic search and summarization. Only 5% had fundamentally transformed how their work gets done.

This is the honest starting point: retrieval is not solved. But once retrieval is adequate, the tools that actually compound your capability over time are not the ones that help you find things faster. They are the ones that help you combine separate things in ways they were not combined before.

The Two-Layer Problem

The clearest frame is two layers. At the organizational level, fragmentation and retrieval friction remain the dominant operational costs. The Mark, Gonzalez, and Harris field study from 2005 followed 24 information workers and found they switched tasks every three minutes on average, with 57% of working spheres interrupted. When interrupted, they resumed the original task only after an average of 2.3 intervening activities. More recent longitudinal work by Gloria Mark, described in a Steelcase interview in 2024, found that average screen focus time before switching had fallen to 47 seconds.

At the individual top-performer level, the highest-value upside appears after the document has already been found: connecting an old client insight to a new product brief, surfacing a postmortem that is structurally identical to an active decision, noticing that a research finding from one domain maps precisely onto a constraint in another. These are not retrieval operations. They are recombination operations — and they are what separate people who use archives from people who compound them.

Why Second Brains Were Always About Expression

Tiago Forte's CODE framework — Capture, Organize, Distill, Express — is often described as a storage system. Its terminal verb is not Archive. It is Express. The PARA organizational structure keeps material sorted by actionability and proximity to current work, not by subject taxonomy. Progressive Summarization compresses notes not to reduce file size but to communicate a fragment to a future self who will encounter it in a different context with a different problem. The whole system is engineered for re-encounter and reuse.

This design lineage is older than software. Commonplace books — the practice, common among educated Europeans from roughly the 12th century onward, of collecting quotations under reusable headings — were explicitly oriented toward combinatorial creativity. Erasmus advised in 1512 that an abundant stock of material be collected under topic headings "to assist free-flowing oratory." Yale University Library's documentation of the form describes it as producing material intended to be repurposed across contexts. The failure mode was documented then too: large commonplace books could become evasions of reading rather than aids to thinking. The same failure mode applies to any modern vault that accumulates without distilling.

Niklas Luhmann's Zettelkasten sharpened the logic. Luhmann accumulated approximately 90,000 index cards over more than 40 years, producing 58 books and over 600 academic articles. The critical design choices: one atomic idea per card; a unique address enabling any card to reference any other; explicit written links made at the moment of writing, when connections are most visible. Luhmann described the system as becoming a communication partner that could produce unexpected connections that surprised even him — a theory of controlled serendipity, not archival order. Steve Jobs' line about connecting the dots only backward is analytically precise: systems increase the density of candidate connections, but meaning is established by acting on one and surviving contact with reality.
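The three design choices — atomic cards, permanent addresses, explicit links made at writing time — can be made concrete in a few lines. This is an illustrative sketch, not Luhmann's actual addressing scheme; the `Card` and `Zettelkasten` names and the address format are hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class Card:
    """One atomic idea with a permanent address and explicit links."""
    address: str                                # unique, never reused, e.g. "21/3a"
    text: str                                   # a single idea, in the owner's words
    links: list = field(default_factory=list)   # addresses of related cards

class Zettelkasten:
    def __init__(self):
        self.cards = {}

    def add(self, address, text, links=()):
        # addresses are permanent: reuse would silently break old links
        if address in self.cards:
            raise ValueError(f"address {address} already in use")
        card = Card(address, text, list(links))
        self.cards[address] = card
        return card

    def neighbors(self, address):
        """Candidate connections: links made outward from this card,
        plus links other cards made back to it."""
        out = set(self.cards[address].links)
        back = {a for a, c in self.cards.items() if address in c.links}
        return sorted(out | back)
```

The point of `neighbors` is that a link written once, in either direction, is visible from both ends — the same property modern backlink panes reproduce.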

Recombination Is How Creativity Works

The cognitive science literature makes a strong case that this is not just metaphor. Sarnoff Mednick's 1962 associative theory defined creative thinking as "the forming of associative elements into new combinations which either meet specified requirements or are in some way useful." He described three mechanisms: serendipity, similarity, and mediation. The more remotely the associative elements are drawn from each other, the more creative the combination is judged to be.

Dedre Gentner's structure-mapping theory explains why this matters for AI-assisted recombination. Analogy is not about surface similarity — it is about aligning relational structure across two domains. The systematicity principle holds that predicates belonging to a mappable system of mutually interconnecting relationships transfer more reliably than isolated predicates. This distinction explains why shallow pattern matching fails: a system that identifies statistical co-occurrence without relational depth produces weak analogies that look compelling and break on contact with domain constraints.

A 2023 meta-analysis by Gerver and colleagues covering 79 studies involving 12,846 participants found a small but statistically significant correlation of 0.19 between memory and creative cognition. Semantic memory — particularly verbal fluency, the ability to strategically retrieve material from long-term memory — accounted for more of the relationship than episodic memory or working memory. A 2025 study using network analysis of participants' semantic memory structure found that broadly connected semantic memory networks that avoided tightly clustered structures predicted objective creative originality. The domain-expertise constraint is irreducible: John Baer's research on domain-specific creativity argues that productive recombination requires enough expertise to distinguish a shallow association from a genuine structural alignment. The archive contributes material; the LLM contributes candidates; the expert contributes the judgment no other layer can supply.

Where LLMs Actually Help

LLMs support recombination through at least six concrete mechanisms: compressing long texts into shorter representations that preserve essential structure; placing multiple documents into a common evaluative frame to surface contradictions and overlaps; retrieving text by conceptual proximity through embeddings rather than keyword overlap; clustering semantically related material that uses different language; proposing structural analogies and alternative framings; and synthesizing a candidate draft over a bounded corpus of source material. Retrieval-Augmented Generation pipelines ground LLM output in retrieved passages from an external knowledge base — the technical architecture that makes running these operations over a personal archive possible.

The controlled experimental evidence on what this delivers is real but tightly bounded. Noy and Zhang, publishing in Science in 2023, found that ChatGPT access let workers complete professional writing tasks 40% faster at 18% higher quality, with the largest gains for lower-ability workers. Brynjolfsson, Li, and Raymond found that an AI conversational assistant raised customer support productivity by 14% on average, with 34–38% improvement for novice workers and minimal impact on the most experienced. Dell'Acqua and colleagues at Harvard Business School ran a pre-registered experiment with 758 BCG consultants: inside the AI capability frontier, GPT-4 raised task completion by 12.2%, speed by 25.1%, and quality by more than 40%. Below-average performers improved 43%; above-average performers improved 17%.

The same study found that for a task designed to fall outside the frontier — requiring contextual, integrative judgment — consultants using GPT-4 were 19 percentage points less likely to produce correct solutions than those without it. The authors described this as a "jagged technological frontier": uneven across task types, with the boundary often invisible to users in real time. The implication is not that LLMs fail at synthesis. It is that the synthesis tasks where they genuinely accelerate work — comparative, corpus-based, source-grounded, with a bounded input set — are specific, and the user needs to know which side of the frontier they are on.

Graph Views — Interface Versus Epistemology

Obsidian's core relational features — backlinks, unlinked mentions, local graph, global graph, tags — are designed to reveal relationships, not just store files. The backlinks pane shows all notes that link to or mention the current note, including unlinked mentions where a note's title appears as plain text without a formal link. These are cognitively plausible aids to noticing clusters, bridges, and candidate connections in the vicinity of a live problem.

The empirical evidence for graph-based knowledge representation in learning settings is genuinely positive. A 2006 meta-analysis by Nesbit and Adesope reviewing 55 studies involving 5,818 participants found meaningful learning benefits from concept and knowledge maps. A 2018 meta-analysis covering 11,814 participants found a moderate effect size of 0.58, with creating maps (0.72) outperforming merely studying pre-built ones (0.43). The catch: these studies examine structured, pedagogically designed maps in explicit learning tasks — not organic, professionally grown note vaults.

What the information visualization literature does support is a hard cognitive limit. Yoghourdjian and colleagues, in a 2018/2019 survey of empirical studies on graph visualization, found that roughly three-quarters of studies used graphs of 100 nodes and 200 edges or fewer — an implicit recognition that larger graphs are cognitively unmanageable for most analytical tasks. Cognitive difficulty increases sharply beyond about 100 nodes, with significant errors on path tasks in high-density graphs over 50 nodes. Okoe, Jianu, and Kobourov found in 2023 that node-link diagrams and adjacency matrices each outperform the other on different task types — the representational choice must match the cognitive task. A vault-wide graph of hundreds of notes is primarily motivational. Local graph and backlinks during active work on a specific question are genuinely analytic tools. The distinction: the graph is useful as an interface, but risky as an epistemology.
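A local graph that respects the ~100-node ceiling is just a bounded breadth-first walk outward from the note you are working on. A sketch, assuming a `note -> linked notes` adjacency mapping (the function and parameter names are illustrative):

```python
from collections import deque

def local_graph(links, start, max_hops=2, node_budget=100):
    """BFS outward from one note, capped by hop count and by the
    ~100-node budget the graph-perception literature suggests for
    analytical tasks. `links` maps note -> list of linked notes."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue
        for nxt in links.get(node, ()):
            if nxt not in seen:
                if len(seen) >= node_budget:
                    # truncate rather than render an unreadable hairball
                    return seen
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return seen
```

The design choice is the early return: when the neighborhood exceeds the budget, the right response is to show less, not to zoom out.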

The Failure Modes Are Structural

The most important finding for anyone building a recombination workflow is that the failures of AI-assisted synthesis are not random — they are systematic and documented.

Uwe Peters and Benjamin Chin-Yee published a study in 2025 testing 10 prominent LLMs on 4,900 summarization tasks. LLM summaries were nearly five times more likely than human-authored summaries to contain broad generalizations. DeepSeek, ChatGPT-4o, and LLaMA 3.3 70B overgeneralized in 26% to 73% of cases. Prompting explicitly for accuracy made the problem worse. The mechanism is structural: models trained on human science writing inherit its tendency toward accessible, broadly applicable summary, and reinforcement from human feedback rewards fluency over scope-fidelity. In a recombination context, this is the core risk — weak connections sound compelling.

Lewis, Mitchell, and colleagues published work in 2024 testing LLMs on variants of analogy tasks. GPT models showed accuracy above 90% on basic letter-sequence transformations. Performance dropped 30–40% on multi-step transformations and fell below 50% on novel alphabet systems dissimilar from pre-training data. Unlike humans, models were susceptible to answer-order effects and paraphrasing. Genuine structural analogy — Gentner's alignment of relational systems — remains more brittle in LLMs than benchmark performance implies.

Buçinca, Malaya, and Gajos at Harvard ran an experiment with 199 participants on AI-assisted decision-making. Users frequently accepted incorrect AI predictions even when they would have done better without AI. Adding explanations did not reduce overreliance — it sometimes increased it, because explanations were interpreted as a global signal of competence rather than evaluated individually. Cognitive forcing functions — requiring participants to explicitly engage with AI recommendations before accepting them — did reduce overreliance, but the more effective the friction, the less users preferred it. This is a workflow design problem, not a model problem.

A Workflow That Holds Up

Capture selectively against live projects, questions, and decisions. Write in your own words, preserve the source, and distill at capture time so that a future self in a different context can determine quickly whether the note is relevant to a new problem. Volume does not correlate with synthesis quality; discipline does. Research on note-taking and comprehension also offers a relevant caution: a Cambridge University Press and Microsoft Research study published in 2025 found that traditional note-taking outperformed LLM-only study for reading comprehension and retention, though the combination of both performed well. Substituting AI summarization for active processing carries a retention cost that compounds over time.
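Capture discipline can be enforced mechanically: a capture that arrives without a source, a restatement in your own words, or a link to live work is rejected at the door. A minimal sketch — the `CapturedNote` fields and the `capture` helper are illustrative, not any particular tool's API:

```python
from dataclasses import dataclass

@dataclass
class CapturedNote:
    """A capture record that forces distillation at capture time."""
    title: str
    source: str      # where this came from — preserved, never optional
    own_words: str   # the idea restated, not pasted
    project: str     # the live project, question, or decision it serves

def capture(title, source, own_words, project):
    """Reject captures that skip the distillation step: no source, no
    restatement, or no connection to live work means no note."""
    for field_name, value in [("source", source),
                              ("own_words", own_words),
                              ("project", project)]:
        if not value.strip():
            raise ValueError(f"capture rejected: '{field_name}' is empty")
    return CapturedNote(title, source, own_words, project)
```

The friction is the feature: every rejected capture is a paste-without-processing that would have inflated the vault without helping a future self.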

Create links where meaning exists, not where software makes linking easy. Every explicit link should be a cognitive claim: these ideas are related in a way that matters for a current or likely future problem. Linking under low friction — because a plugin suggests it, because a keyword matches — creates the appearance of a rich network without the substance.

Use backlinks, unlinked mentions, and local graph as active exploration tools during live work on a specific question. Use LLMs on bounded, multi-document packets oriented toward comparison, clustering, and contradiction-surfacing — not toward single-document summarization over material you already understand. Build cognitive friction into the verification step: check proposed connections against original sources, reject structurally weak analogies, and treat AI-generated synthesis as a draft hypothesis, not a completed conclusion.
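Both halves of that workflow — the bounded packet and the verification friction — can be expressed as small guards. A sketch under stated assumptions: the document cap, prompt wording, and function names are mine, and the verification step stands in for the cognitive forcing functions Buçinca and colleagues studied, not for their exact interface:

```python
def build_packet(notes, question, max_docs=6):
    """Assemble a bounded, comparison-oriented input set for an LLM.
    The document cap keeps the task on the favorable side of the
    frontier: corpus-based synthesis over a known input set.
    `notes` maps source name -> text."""
    if len(notes) > max_docs:
        raise ValueError(f"packet too large: {len(notes)} > {max_docs} docs")
    header = (f"Task: compare these {len(notes)} sources on the question "
              "below. List agreements, contradictions, and gaps. "
              "Cite sources by name.\n")
    body = "\n".join(f"[{name}]\n{text}" for name, text in notes.items())
    return f"{header}\nQuestion: {question}\n\n{body}"

def verify(connections):
    """Cognitive forcing function: every AI-proposed connection must be
    explicitly accepted (True) or rejected (False) against sources
    before any count as conclusions. None means not yet checked."""
    pending = [c for c, ok in connections if ok is None]
    if pending:
        raise RuntimeError(f"{len(pending)} connections not yet "
                           "checked against sources")
    return [c for c, ok in connections if ok]
```

`verify` refuses to return anything while an unchecked connection remains — the draft-hypothesis stance from the paragraph above, encoded as a hard stop rather than a good intention.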

The productivity gap that will compound over the next five years is not between those with LLM access and those without — frontier model access is commoditizing rapidly. It will be between those who built a disciplined, well-linked archive that can ground a frontier model in unique personal experience — decisions made, patterns noticed, postmortems written — and those who interact with generic AI over generic data. Retrieval is table stakes. Recombination, executed with discipline and skepticism, is the moat.

Sources

Work Fragmentation and Attention

  1. Gloria Mark, Victor Gonzalez, Justin Harris, "No Task Left Behind? Examining the Nature of Fragmented Work," CHI 2005 https://www.ics.uci.edu/~gmark/CHI2005.pdf

  2. Gloria Mark, Steelcase Research interview on 47-second attention span (2024) https://www.steelcase.com/research/articles/our-47-second-attention-span-with-gloria-mark-s5-ep3-transcript/

  3. Gloria Mark, Gallup interview on interrupted work resumption (2006) https://news.gallup.com/businessjournal/23146/too-many-interruptions-work.aspx

Enterprise Productivity and AI Adoption

  1. Microsoft Work Trend Index 2025 — employees interrupted every two minutes, via CNBC https://www.cnbc.com/2025/05/19/microsoft-says-employees-are-interrupted-every-two-minutes.html

  2. Microsoft Work Trend Index 2025 — Swiss full-detail report https://news.microsoft.com/de-ch/2025/06/17/new-microsoft-study-reveals-the-rise-of-the-infinite-workday-40-of-employees-check-email-before-6-a-m-evening-meetings-up-16/

  3. McKinsey Global Institute, "The Social Economy" (2012) https://www.mckinsey.com/industries/technology-media-and-telecommunications/our-insights/the-social-economy

  4. McKinsey Global Institute, "The Social Economy" executive summary PDF (2012) https://www.mckinsey.com/~/media/mckinsey/industries/technology%20media%20and%20telecommunications/high%20tech/our%20insights/the%20social%20economy/mgi_the_social_economy_executive_summary.pdf

  5. EY Work Reimagined Survey 2025 https://www.ey.com/en_gl/newsroom/2025/11/ey-survey-reveals-companies-are-missing-out-on-up-to-40-percent-of-ai-productivity-gains-due-to-gaps-in-talent-strategy

Second Brain Frameworks and PKM Methodology

  1. Forte Labs, PARA method official documentation https://fortelabs.com/blog/para/

  2. PARA method overview — get-alfred.ai https://get-alfred.ai/blog/para-method

Historical Antecedents: Commonplace Books and Zettelkasten

  1. Yale University Library, commonplace book online exhibition https://onlineexhibits.library.yale.edu/s/not-reading/page/commonplace-book

  2. Farnam Street, networked knowledge and combinatorial creativity https://fs.blog/networked-knowledge-and-combinatorial-creativity/

  3. Commonplace book — Wikipedia https://en.wikipedia.org/wiki/Commonplace_book

  4. Niklas Luhmann Zettelkasten methodology — get-alfred.ai https://get-alfred.ai/blog/zettelkasten

  5. Luhmann original Zettelkasten detailed analysis — ernestchiang.com (2025) https://www.ernestchiang.com/en/posts/2025/niklas-luhmann-original-zettelkasten-method/

  6. Zettelkasten introduction — zettelkasten.de https://zettelkasten.de/introduction/

Cognitive Science: Associative Creativity and Analogy

  1. Sarnoff Mednick, "The Associative Basis of the Creative Process," Psychological Review (1962) — Semantic Scholar https://www.semanticscholar.org/paper/The-associative-basis-of-the-creative-process.-Mednick/927c10385d93d538e2791f8ef28c5eaf96e08a73

  2. Sarnoff Mednick (1962) — direct PDF https://pdfs.semanticscholar.org/927c/10385d93d538e2791f8ef28c5eaf96e08a73.pdf

  3. Dedre Gentner, "Structure-Mapping: A Theoretical Framework for Analogy," Cognitive Science (1983) https://groups.psych.northwestern.edu/gentner/papers/Gentner83.2b.pdf

  4. Dedre Gentner and Arthur Markman, "Structure Mapping in Analogy and Similarity," American Psychologist (1997) https://groups.psych.northwestern.edu/gentner/papers/GentnerMarkman97.pdf

  5. Structure-mapping theory — Wikipedia https://en.wikipedia.org/wiki/Structure-mapping_theory

Cognitive Science: Semantic Memory and Creativity

  1. Gerver et al., "Memory and creativity: A meta-analytic examination," 79 studies, n=12,846 (2023) — Penn State https://pure.psu.edu/en/publications/memory-and-creativity-a-meta-analytic-examination-of-the-relation

  2. Semantic memory networks and creative originality — BMC Psychology (2025), n=106 https://pmc.ncbi.nlm.nih.gov/articles/PMC12288250/

  3. Semantic memory richness and creative idea production — Thinking & Reasoning, PMC (2022) https://pmc.ncbi.nlm.nih.gov/articles/PMC10128864/

LLM Productivity: Controlled Experiments

  1. Noy and Zhang, "Experimental Evidence on the Productivity Effects of Generative Artificial Intelligence," Science (2023) https://www.science.org/doi/10.1126/science.adh2586

  2. Noy and Zhang — SSRN preprint https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4375283

  3. Brynjolfsson, Li, and Raymond, "Generative AI at Work," NBER Working Paper 31161 (2023) https://www.nber.org/papers/w31161

  4. ITIF summary of Brynjolfsson, Li, and Raymond https://itif.org/publications/2023/07/10/customer-support-agents-using-ai-gpt-tool-saw-nearly-14-percent-increase-in-productivity/

  5. Dell'Acqua, McFowland, Mollick et al., "Navigating the Jagged Technological Frontier," HBS Working Paper 24-013 (2023) https://www.hbs.edu/faculty/Pages/item.aspx?num=64700

  6. Harvard Crimson coverage of BCG / Dell'Acqua et al. study https://www.thecrimson.com/article/2023/10/13/jagged-edge-ai-bcg/

  7. Professor KL Substack, BCG frontier follow-up research (2026) https://professorkl.substack.com/p/discovering-ais-jagged-frontier-and

LLM Failure Modes: Summarization, Analogy, Overreliance

  1. Uwe Peters and Benjamin Chin-Yee, LLM summarization overgeneralization, Royal Society Open Science / arXiv (2025) https://arxiv.org/abs/2504.00025

  2. Healthcare-in-Europe coverage of Peters and Chin-Yee https://healthcare-in-europe.com/en/news/generative-ai-llm-exaggeration-science.html

  3. Martha Lewis, Melanie Mitchell et al., LLM analogy brittleness, arXiv:2411.14215 (2024) https://arxiv.org/abs/2411.14215

  4. AI Guide for Thinking Humans: LLM analogy stress testing (2024) https://aiguide.substack.com/p/stress-testing-large-language-models

  5. Buçinca, Malaya, and Gajos, "To Trust or to Think," ACM CSCW (2021) https://www.eecs.harvard.edu/~kgajos/papers/2021/bucinca21trust.pdf

  6. Buçinca, Malaya, and Gajos — arXiv preprint https://arxiv.org/abs/2102.09692

Graph Visualization and Concept Mapping

  1. Nesbit and Adesope, "Learning with Concept and Knowledge Maps: A Meta-Analysis," Review of Educational Research (2006) https://www.sfu.ca/~jcnesbit/research/NesbitAdesope2006.pdf

  2. 2018 meta-analysis on concept map learning outcomes, n=11,814 — WSU Library https://rex.libraries.wsu.edu/esploro/outputs/journalArticle/Studying-and-Constructing-Concept-Maps-a/99900601155801842

  3. Harvard ABLConnect summary of Nesbit and Adesope findings https://ablconnect.harvard.edu/concept-map-research

  4. Yoghourdjian et al., "Exploring the Limits of Complexity: A Survey of Empirical Studies on Graph Visualisation," Visual Informatics (2018/2019) https://www.cg.tuwien.ac.at/research/publications/2019/YOGHOURDJIAN2019/YOGHOURDJIAN2019-paper.pdf

  5. Okoe, Jianu, and Kobourov, "Node-link or Adjacency Matrices," IEEE TVCG (2023) https://www.computer.org/csdl/journal/tg/2023/01/09908291/1HbasfaWNX2

Obsidian and PKM Tools

  1. Marc Littlemore, "A Beginner's Guide to Note-Taking in Obsidian" https://www.marclittlemore.com/beginners-guide-note-taking-obsidian/

  2. Obsidian Forum — graph view and unlinked mentions https://forum.obsidian.md/t/graph-view-option-to-show-connections-based-on-unlinked-mentions/8802

  3. Eleanor Konik, "In Defense of Obsidian's Graph View" (2021) https://www.eleanorkonik.com/p/its-not-just-a-pretty-gimmick-in-defense-of-obsidians-graph-view

  4. Reddit r/ObsidianMD — "Is graph view really useful?" (2025) https://www.reddit.com/r/ObsidianMD/comments/1kbgrs1/is_graph_view_really_useful/

  5. Eric Ma, "Mastering Personal Knowledge Management with Obsidian and AI" (2026) https://ericmjl.github.io/blog/2026/3/6/mastering-personal-knowledge-management-with-obsidian-and-ai/

Technical Documentation: RAG and Embeddings

  1. Retrieval-Augmented Generation overview — Prompting Guide https://www.promptingguide.ai/research/rag

  2. Google Cloud, Retrieval-Augmented Generation overview https://cloud.google.com/use-cases/retrieval-augmented-generation

  3. OpenAI embeddings for semantic search — Milvus AI Quick Reference https://milvus.io/ai-quick-reference/how-do-i-use-openais-embeddings-for-semantic-search

Note-Taking, AI Assistance, and Comprehension

  1. Cambridge University Press / Microsoft Research, traditional note-taking vs. AI — Computers & Education (2025) https://phys.org/news/2025-12-traditional-ai-chatbots-comprehension-combined.html

  2. AI assistance dilemma in note-taking — arXiv:2509.03392 (2025) https://arxiv.org/html/2509.03392v1