Building A Second Brain with Obsidian

2025-03-05

I started building an Obsidian vault as a second brain in 2022. Three years on, at the time of posting this, I’ve re-learnt some lessons about dealing with unstructured data on the fly.


The Beginning

I started building an Obsidian vault as a second brain in 2022. It’s now three years later at the time of posting this; I’ll share some areas where I either lost time or didn’t fully recognise what I was getting into when I started.

Building a second brain is arguably a classic engineering problem at this point, made even more popular by large language models and AI showing what we can do with unstructured data.

A Second Brain runs on the basis of quick storage and quick retrieval. That sounds deceptively simple if I leave it at that; it’s easier to recognise the costs you’ll be paying (in time and energy) by going into what a Second Brain isn’t, and what we wouldn’t want it to be.

Second Brains are not formally structured, they’re not filing cabinets and they don’t come with a bunch of colour-coordinated folders ready to welcome your stored notes from day one. This is a classic mistake that I made when I first started using Obsidian: I thought gaining insight into my personal and professional interests meant learning how to file them away smarter and organise them harder.

A few weeks of folder management left me with a very empty feeling; I dropped the approach after the first month. Simply put: Super-organised folders are tempting, but our brains don't work this way.

To paint a full picture of why my folder-based approach didn’t work, I’ll have to use two pretty dry terms from backend software development: read optimisation vs write optimisation.

Read optimisation is what I do and have done as a contract backend and mobile developer, working at a tech startup here in London over the last few years. So that’s going to wind up being the underlying vibe tying this whole post together.

Write optimisation is the other path, based on strong hierarchy and structure (folders! sub-folders!) and I’ll shine some light on it right now to make it very clear why it’s not the right direction for building your vault as your second brain.

Obsidian Isn’t An Editor for Folder Management

Obsidian’s flexibility is both a strength and a source of potential decision fatigue

Plenty of people could pull me up on the heading above, but I’ve learnt the hard way: If I’m using a lot of folders in my Obsidian vault then - 8 times out of 10 - I’ve started out by using the wrong editor for this kind of project.

Other editors (Notion, OneNote, Confluence, DokuWiki, GitBook) have more baked-in features that make a write-optimised approach more rewarding in far less time than it’d take to get your vault going with Obsidian.

Write-optimised systems are like referring to a library’s ISBN catalogue. Those kinds of knowledge systems are handy when:

  • Information has a clear, known structure
  • Each piece of information belongs in exactly one place
  • Relationships between information are well-understood
  • Updates are simple because you only change one record

Now this may be the approach you want if you’re working in a large establishment. Or maybe you’re the maintainer of a wiki that’s been around a couple of decades.

In that case, you’ll likely be maintaining a lot of pre-existing and highly codified documents; the taxonomies are well established and your energy is spent on maintaining the integrity of your docs rather than welcoming any kind of new influx of data (you’ve got auditors and curators for that).

I’ve said it above and I’ll repeat it here: You probably don’t want to be using Obsidian as your storage and editing tool if that’s the case.

But if you’re like me and you’ve got a history of enjoying work in small businesses and startups, then you often find yourself on a team where you’re looking to store a lot of new, real-time input and feedback on the fly to make sense of it later.

Hedging My Bets with the PARA Method

There is a cheat code I took at the fork in the road. Even at my old workplace, where I built “horizontal” (or read-optimal) backends, we still knew we needed some kind of structure: a devops team member to coordinate the effort at the end (and beginning) of every working day.

Using the PARA system inside my vault effectively represents that same team member; a lean effort to coordinate notes in the short term.

PARA’s acronym stands for Projects, Areas, Resources and Archive. You’ll have to read up on the fully detailed thought process elsewhere if you’re interested and have never heard of it before - I mention it as a sidenote here.
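For the gist, a PARA skeleton is as light as four top-level folders. The annotations below are my own shorthand rather than official definitions:

  • Projects/ : short-lived efforts with a clear finish line
  • Areas/ : ongoing responsibilities with no end date
  • Resources/ : reference material on topics of interest
  • Archive/ : anything that has gone inactive from the three above

Four folders is about as much hierarchy as the read-optimised approach can tolerate.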

I tried it, stuck with it and it’s still effective for me today.

It helps you organise your notes, but doesn’t impose so much organisation that it gets in the way of your second brain stuff at any point.

... Now back to the main thread!

Startup or no startup, maybe you just write a lot of post-it notes and you’re looking for a way to help you find some kind of pattern within your habits.

In that case, working on structuring your notes will only get in the way; the nature of the knowledge you’re storing is fundamentally unstable and subject to change - constantly!

If only we had a second brain that could process that kind of unstructured data. How do we turn this fantasy into something real? We want the read-optimised approach.

Building A Read-Optimised Vault (With Plugin Help)

Now we know our approach is one where (see the example note after this list):

  • Information can exist in multiple contexts
  • Relationships between our notes emerge and evolve over time
  • Quick access is the priority over structured folders
  • New connections can form without us having to consciously reorganise what we knew before to recall what we know now
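To make the list above concrete, here’s the rough shape of a note in a read-optimised vault. The note, tags and links are invented for illustration; the point is that tags and [[wikilinks]], not folders, carry the structure:

```markdown
---
tags: [feedback, onboarding, mobile]
---

# 2025-03-02 - Rider feedback call

Push notifications arrive late on some Android devices; possibly the same
battery quirk described in [[Android Battery Optimisation]]. Worth comparing
against [[2025-02-14 - Support Ticket Review]] before the next release.
```

One note, three tags, two links: it can surface in several contexts at once without being filed anywhere in particular.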

And, like any approach, it comes with a price! But more on the tradeoffs later…

For now, if you look at the summary of our approach above, I think you’ll agree we’re basically looking to mimic our human noggin’.

Much like how our brain forms new neural pathways through learning and experience, we want our Obsidian vault to be a living, breathing expanse of knowledge that can re-write those pathways as we throw more notes at it.

Too Many Plugins, Too Much Help

I’d love to tell you Obsidian can do this straight out of the box, but if it could, I’d never have spent months writing, testing, benchmarking and re-writing my own plugin for my vault.

The plugin is still unpublished and strictly for my own use at the time of writing, so don’t worry: This isn’t some kind of long-form article leading up to me promoting any kind of link at the end. It’s strictly to warn about the pitfalls I’ve been through so you don’t have to fall into them yourself.

The first pitfall was making a lot of folders, and the second pitfall was thinking I could customise my Obsidian vault to the hilt with community plugins. Maybe if I could get it just right… right?

Obsidian’s flexibility is both a strength and a source of potential decision fatigue. It’s definitely an overall benefit that the editor is so customisable through third-party plugins, but time goes by and you have to stay focused on the ultimate goal. You can’t be giving yourself side quests trying to tweak plugins into a failed attempt at covering all the bases.

Planning for scale is not easy; it’s also where the costs of our “quick storage, quick retrieval” expanse start to kick in at a greater rate; costs we’ve been deferring until now.

The Expanse: Plugins, Knowledge and All Things Scale

How we find connections needs to be expressed in a way that scales with what we know

It’s all well and good storing disparate, seemingly separate notes in your vault on the fly. It’s even fun, until it just starts to look like chaos with no hope of order, pattern or insight.

Eventually you do want something to figure out how your notes are connected; you want relationships (and structure! the very thing we gave up!) between notes, and you want to infer where exactly everything you know and have learnt is taking you next on your journey. Pure discovery.

That’s where our vault’s brain needs to start calculating a lot of things on demand. It’s not going to be easy, and I haven’t completely finished building my answer to this just yet.

But I have learnt (or re-learnt) some valuable lessons with the plugin I’ve built so far.

Lessons from Writing Another Obsidian Plugin

Mia Yim is actually my third Obsidian plugin, but the first where I’ve aimed for some undiscovered country.

Lesson One - Build on Top of the Editor’s Two Core Strengths

In the name of moving fast on the fly, there are two features Obsidian does brilliantly out of the box:

  1. Obsidian brings a combo of live markdown rendering and the universal portability of the markdown format. It means you can share notes easily, yet also be confident about how you’ve written up and formatted your note content on the spot. VERY few editors offer this with such ease. Some have made good efforts to catch up with Obsidian in this area, but the Obsidian team raised the bar here in general.
  2. Obsidian brings a Graph View to the party - a view that you can easily manipulate to try establishing patterns when looking at your notes from a bird’s eye view.

It took me over a year (on and off usage) to recognise that building a knowledge vault to scale starts on top of these two features as assets, instead of building around them.

Lesson Two - Let Real-World Data Set Your Boundaries

This is not really a new lesson for me, but it never fails to amaze me how much I end up re-tracing steps on old lessons learnt (maybe a huge reason why I need a Second Brain to cut that out!). I learnt this approach back at Packfleet and it’s never really left me.

Keeping in mind how my plugin - Mia Yim - needed to inform the Graph View (for long-term pattern seeking) I decided I needed to tackle three specific needs with an immediate, live data pipeline:

  1. I needed to seed our knowledge vault with a pre-existing, pre-defined set of topics. I decided a useful way to pull in some topical boundaries would be to pull in the existing, available docs of my favourite codebases on GitHub - some codebases I already helped maintain as an open-source developer, others large enough that I was hoping to understand them well enough to make a significant contribution down the line (a sketch follows this list).

  2. My knowledge vault would grow by tracking the community use of these codebases and the issues that their respective communities flagged along the way. Much like how users hit the in-app feedback button to let you know your startup’s service has run into problems you couldn’t have predicted.

  3. I needed my vault (and especially my graph view) to be able to track and measure the DIFFERENCE between existing, documented code and growing, undocumented real-world use.
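To give a feel for need number one, here’s a minimal sketch of the seeding step: pull a repo’s open issues from GitHub’s public REST API and drop each one into the vault as a markdown note, with labels mapped to tags so the Graph View has something to cluster. The folder, note format and example repo are placeholders, and Mia Yim’s real pipeline does far more (including the diffing in need three); this is just the shape of the idea.

```typescript
// Minimal seeding sketch (Node 18+, which ships a global fetch).
import { mkdir, writeFile } from "node:fs/promises";

// Hypothetical vault folder that receives the seeded notes.
const VAULT_DIR = "Vault/Inbox/GitHub Issues";

interface Issue {
  number: number;
  title: string;
  body: string | null;
  labels: { name: string }[];
  pull_request?: unknown; // present when the "issue" is actually a PR
}

async function seedIssues(owner: string, repo: string): Promise<void> {
  // GitHub's public REST endpoint for open issues (unauthenticated, rate-limited).
  const res = await fetch(
    `https://api.github.com/repos/${owner}/${repo}/issues?state=open&per_page=20`,
    { headers: { Accept: "application/vnd.github+json" } }
  );
  if (!res.ok) throw new Error(`GitHub API returned ${res.status}`);
  const items = (await res.json()) as Issue[];

  await mkdir(VAULT_DIR, { recursive: true });
  for (const issue of items.filter((i) => !i.pull_request)) {
    // One markdown note per issue; labels become tags for the Graph View to cluster.
    const tags = ["github-issue", ...issue.labels.map((l) => l.name.replace(/\s+/g, "-"))];
    const note = [
      "---",
      `tags: [${tags.join(", ")}]`,
      "---",
      "",
      `# ${repo} #${issue.number}: ${issue.title}`,
      "",
      issue.body ?? "(no description)",
    ].join("\n");
    await writeFile(`${VAULT_DIR}/${repo}-issue-${issue.number}.md`, note);
  }
}

seedIssues("obsidianmd", "obsidian-releases").catch(console.error);
```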

I’m not the first person to try plugging the gap between how things should work on paper and how they REALLY work in practice, but my heart is in it (and has been for years, no matter what job or contract I was working at the time) and that’s when you know you’ve got a personal interest that you have to follow through to the end.

Lesson Three - Scaling Relationships is Hard Work

Again, not so much a new lesson but more like a lesson revisited. My old boss Hugo Cornejo would tell you everything you need to know about managing the costs of scaling any project.

I mentioned earlier on that we’d be deferring costs (mostly computational, but also the time and energy spent behind the scenes trying to make this pattern matching stuff work!) by taking the read-optimised route as our second brain starting point.

We’re quick and cheap on reads, but now it’s obvious our plugin will be doing the heavy lifting and paying the price on writes.

How we calculate those connections needs to be expressed in a way that scales with our vault’s knowledge.

The challenge of scale in knowledge management isn't just about storing more notes - it's about maintaining meaningful relationships between them as they grow. Think about it this way: when your vault has 100 notes, you might manually tag and connect them. But what happens at 1,000 notes? Or 100,000? The relationships you establish need to deliver consistent insight regardless of scale.
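A back-of-the-envelope calculation makes the pain concrete: n notes allow n(n-1)/2 possible pairwise links, so roughly 5,000 candidate connections at 100 notes, 500,000 at 1,000, and about 5 billion at 100,000. Any scheme that evaluates every pair on demand is dead on arrival; whatever does the connecting has to prune aggressively.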

These relationships need to reveal three things:

  1. Patterns you already anticipated and want to track
  2. Connections that confirm your existing understanding
  3. Insights you never expected to find

Arbitrary tagging (like tagging everything that mentions a particular word) creates noise rather than insight. This is where Mia Yim steps in.

ONE of the plugin’s core jobs is to suggest new “writes” on demand - specifically, to analyse notes and suggest meaningful tags at the click of a button. But this presents its own technical challenges (a toy sketch follows the list below):

  1. Performance: The analysis needs to happen without slowing down Obsidian’s live rendering.
  2. Accuracy: The suggestions need to be meaningful enough to create valuable patterns in the Graph View
  3. Scale: The quality of suggestions needs to remain consistent as the vault grows
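Here’s a toy sketch of the kind of scoring involved: weight how often a candidate term appears in the note against how common it is across the whole vault (a TF-IDF-style weighting). Mia Yim’s actual scoring is more layered, so treat the names and thresholds below as illustrative.

```typescript
// Toy TF-IDF-style tag suggestion: just counting.
// `vault` maps note paths to their plain-text content.

function tokenise(text: string): string[] {
  return text.toLowerCase().match(/[a-z][a-z-]+/g) ?? [];
}

function suggestTags(
  notePath: string,
  vault: Map<string, string>,
  candidateTags: string[],
  maxSuggestions = 5
): string[] {
  const noteTokens = tokenise(vault.get(notePath) ?? "");
  const totalNotes = vault.size;

  const scored = candidateTags.map((tag) => {
    // Term frequency: how prominent the term is within this note.
    const tf =
      noteTokens.filter((t) => t === tag).length /
      Math.max(noteTokens.length, 1);

    // Document frequency: how many notes in the vault mention the term at all.
    let df = 0;
    for (const content of vault.values()) {
      if (tokenise(content).includes(tag)) df += 1;
    }

    // Terms that appear everywhere earn a near-zero weight, damping the
    // "arbitrary tagging" noise mentioned earlier.
    const idf = Math.log((totalNotes + 1) / (df + 1));
    return { tag, score: tf * idf };
  });

  return scored
    .filter((s) => s.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, maxSuggestions)
    .map((s) => s.tag);
}
```

Even this toy exposes challenge number one: it re-tokenises the entire vault on every call, which would never survive Obsidian’s live rendering. A real implementation would precompute and cache the document frequencies, updating them incrementally as notes change.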

I’ll admit I got very myopic about accuracy for months, writing a testing suite that ultimately proved semi-redundant.

After months of development, I had an important reality check: perfect tag-matching accuracy wasn't actually the goal. Users want suggestions, not decisions. I'd been used to building last-minute “manual overrides” (which are nothing more than buttons on an app screen for the user to have the final say) into apps at work. Here was no different.
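In Obsidian plugin terms, “suggestions, not decisions” boils down to something like the sketch below: present candidate tags in a SuggestModal and only touch the note once the user picks one. The modal and vault APIs (SuggestModal, Vault.read, Vault.modify) are Obsidian’s real ones, but the wiring is a simplified stand-in, not Mia Yim’s actual code.

```typescript
import { App, SuggestModal, TFile } from "obsidian";

// The manual override, formalised: nothing is written until the user chooses.
class TagSuggestModal extends SuggestModal<string> {
  constructor(app: App, private file: TFile, private candidates: string[]) {
    super(app);
  }

  getSuggestions(query: string): string[] {
    // Narrow the precomputed suggestions as the user types.
    return this.candidates.filter((tag) =>
      tag.toLowerCase().includes(query.toLowerCase())
    );
  }

  renderSuggestion(tag: string, el: HTMLElement): void {
    el.createEl("div", { text: `#${tag}` });
  }

  async onChooseSuggestion(tag: string): Promise<void> {
    // The one real "write": append the chosen tag to the note.
    const content = await this.app.vault.read(this.file);
    await this.app.vault.modify(this.file, `${content}\n#${tag}`);
  }
}
```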

This took me away from quantitative testing and towards prioritising quality over quantity; live insight over mere static analysis. Could I really come up with a “pattern discovery assistant” worth a damn? It remains an open question I’m working on.

And yeah, I’m still doing it without involving any kind of neural network or LLM in the pipeline. I like AI in a lot of experimental phases on some projects, but I couldn’t see benefit from vectorising notes in this particular case, and that remains my take on it right now.

How It’s Going With Mia Yim (Beta)

Quality over quantity

By suggesting possible tags rather than enforcing them, the basic service is for Mia Yim to help bridge the gap between flat, horizontal storage and meaningful relationships, while keeping the user in control of their knowledge organisation.

The next frontier is pattern inference - the huge second service that I haven’t solved completely yet. But Mia Yim, even in its beta stage, is proving that building a second brain is still just as fun as ever.

```mermaid
graph TD
    Documentation[Documentation Base]
    Plugin[Mia Yim Plugin]

    subgraph Internal["Internal Analytics"]
        TopicHealth[Topic Health Analysis]
        ChapterOrg[Chapter Organisation]
        TopicRelation[Topic Integration Analysis]
        DocStructure[Documentation Structure]
    end

    subgraph External["External Analytics"]
        RealWorldUsage[Real-World Usage Analysis]
        EmergingTopics[Emerging Topics Tracker]
        UserLanguage[User Language Patterns]
    end

    Documentation -->|feeds into| Plugin
    Plugin -->|analyses| Internal
    Plugin -->|connects with| External
    Internal -->|strengthens internal structure| Documentation
    External -->|expands documentation reach| Documentation

    %% Styling
    classDef docs fill:#f5f5f5,stroke:#333,stroke-width:1px
    classDef plugin fill:#f9f,stroke:#333,stroke-width:2px
    classDef internal fill:#bbf,stroke:#333
    classDef external fill:#bfb,stroke:#333

    class Documentation docs
    class Plugin plugin
    class Internal,TopicHealth,ChapterOrg,TopicRelation,DocStructure internal
    class External,RealWorldUsage,EmergingTopics,UserLanguage external
```

It’s also reminded me of a few core interests that never really left me. I once again found the courage (and clarity) to work on improving my delivery of those interests: prioritising intelligent suggestions over rigid automation, focusing on quality over quantity, and remembering that the ultimate goal is to empower users to tend and prune their own digital garden.

Building a Second Brain isn’t about replicating perfect order; it’s about embracing the fluid nature of knowledge. People may convince you to do that through writing good old-fashioned algorithms or by simply embracing AI… frankly, it doesn’t really matter which tools you reach for at the ground level. Get started and have fun doing it.

What matters in the long run is that embracing the fluid nature of knowledge means finding the patience to listen to people’s issues, including your own, even when those issues are coming at you thick and fast with some noise attached.

This journey to filter away that noise, to distill the underlying signals and uncover the common ground we share — that journey is ongoing. So, let's keep exploring, keep building, and keep listening. The insights we uncover together will be the seeds of something extraordinary.