I’m building a new company. I want to create a search engine, Boundless, that will let you search data across all your apps through a single interface.
This memo details the significance of the idea and my immediate plan.
Please send all the feedback that you have, especially if it’s negative. You can leave comments below this post or message/email me if you prefer.
Our entire lives are etched into the memories of our computers. Yet only a tiny fraction of that information is readily accessible at any given moment: the files on your desktop, say, or the items at the top of your notes app, downloads folder, or social media feed. Everything else gets lost in your “digital basement.” Information collects there easily, but finding anything specific is difficult and often so time-consuming that searching for it is pointless.
A lot of transient data, such as your texts or search history, naturally accumulates in your basement every day. For example, I remember reading an article a couple of weeks ago that discussed the merits of an operating system that separates data from user interfaces. But I had no idea who the author was or what website hosted the article, so searching for it didn’t work. I ended up manually scrolling through several weeks of browser history to find it.
Sadly, even if I try my hardest to keep all my important information accessible, it ends up in the basement anyway. As I’m reading a book, I meticulously highlight and tag important quotes, then sort them into their appropriate folders. Yet a couple weeks later when I’m wondering “what was that one thing I read about finding a good co-founder?” the search process is still laborious. I end up trying a combination of filtering by tags, repeatedly tweaking keyword searches, and digging through several layers of nested folders and documents.
Even more troubling is that I never revisit most of the quotes I highlight. When the moment comes that a highlighted quote would be immediately useful, I often have such a vague idea of what I’m looking for that I don’t even know how to begin searching for it.
Imagine if you could resurface anything you’ve learned instantly, whenever you need it.
To do this, we need a search interface that lets you query across all your data: your notes, documents, emails, browser history, tweets, RSS feeds, and so on. And it has to be super fast, super accurate, and feel super natural to use.
We don’t index information in our brains by keywords, the way most search interfaces do. Usually we only remember the essence of an idea; all the specific words we read are forgotten. And we don’t remember ideas as isolated pieces of information like our computers do. When we understand an idea deeply, our brain links it to other related concepts and facts. We can visualize our cumulative knowledge as a vast graph of interconnected information.
This new search interface has to deeply understand what you’re looking for and relate pieces of information to one another much as our brains do. This necessitates a search algorithm with advanced natural language processing, one that is better than the state of the art.
This algorithm should be so good that you can search a book for a concept and get back a list of quotes that match your query. For example, you should be able to search Nietzsche’s “On the Genealogy of Morality” with the query “why did Nietzsche think the power of forgetting was so important?” and get several quotes that help answer that question, even if none of them contain the words “power of forgetting.”
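To make the difference between matching words and matching concepts concrete, here is a deliberately tiny sketch. It is not the Boundless algorithm, which does not exist yet. The CONCEPTS lexicon is a hypothetical, hand-written stand-in for whatever learned language model would do this job in practice, and the two passages are invented placeholders, not quotations from the book.

```python
import math
import re
from collections import Counter

# Hypothetical hand-written lexicon standing in for a learned language model:
# each word is mapped to a coarse concept label.
CONCEPTS = {
    "forgetting": "forget", "forget": "forget", "oblivion": "forget",
    "power": "strength", "strength": "strength", "force": "strength",
    "important": "value", "necessary": "value", "essential": "value",
}

def concept_vector(text):
    """Count the concepts (not the literal words) mentioned in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(CONCEPTS[w] for w in words if w in CONCEPTS)

def cosine(a, b):
    """Cosine similarity between two concept-count vectors."""
    dot = sum(a[k] * b[k] for k in a)  # a Counter returns 0 for missing keys
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank(query, passages):
    """Return the passages that share at least one concept with the query, best first."""
    q = concept_vector(query)
    scored = [(cosine(q, concept_vector(p)), p) for p in passages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for score, p in scored if score > 0]

# Invented placeholder passages, not quotations.
passages = [
    "Oblivion is an active force, essential to a healthy mind.",
    "The origin of punishment lies in the creditor relationship.",
]
query = "why did Nietzsche think the power of forgetting was so important?"
hits = rank(query, passages)
```

Here the first passage is returned even though it contains neither “power” nor “forgetting,” because it maps to the same concepts as the query. A hand-written lexicon obviously does not scale; the research question is how to get this behavior from a model that learns the concepts on its own.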
When Boundless is able to achieve this level of fidelity, there will be secondary benefits besides effective information retrieval. For one, Boundless will render it unnecessary to organize your documents and notes. You’ll be able to find anything via search much faster. Even highlighting important quotes from books and articles will become unnecessary — you’ll be able to search through text indexed by concepts, a far more effective method than scrolling through a list of highlights. Organizing and highlighting will become activities solely to help your own thinking, and not a necessity for information retrieval.
But most importantly, our relationship with our computers will change. We will begin to rely on our computers to function as an extended memory — a true “second brain” — infallible and virtually infinite in capacity.
As the Boundless search algorithm is refined, our second brain will become more and more alive. It’ll go beyond helping us store and recall information. It’ll help us process it. Boundless will be able to connect new ideas we encounter to ideas we’ve learned before.
Imagine the following scenario: I’m reading Paul Graham’s essay “Before the Startup,” and I encounter the quote “the way to get startup ideas is not to try to think of startup ideas.” As I read this, Boundless pulls up a related quote from an article by Alexander Obenauer that I read several weeks ago: “As creators, ideas are our lifeblood. Exploring them, even without executing on them, has great merit. Knowing how to come up with and develop ideas effectively is a tremendous skill.” And now, because Boundless has resurfaced that quote at the right time, I’m thinking critically about the value of deliberate ideation. Are these two ideas from Graham and Obenauer contradictory? How do I know which of my ideas to indulge? There’s a good chance I wouldn’t have considered these questions at all if Boundless hadn’t helped me make the connection between Graham’s and Obenauer’s articles.
Boundless will help you think about ideas from new perspectives, identify hard-to-see connections to related ideas, and chase down the multitude of implications of every new thought. It transcends the category of productivity and search tools. Boundless provides a medium for thought that will augment human intelligence.
I believe I need to approach building this product differently than the “by the book” Lean Startup/Y Combinator approach.
Conventional wisdom dictates I should build an MVP and launch as fast as I can. If I went down this path, the first public version of Boundless would likely support full-text keyword searches and a few important integrations (maybe local documents, Instapaper, and Twitter?). I would iterate from there, adding more integrations and improving keyword search using state-of-the-art NLP algorithms.
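For a sense of how little the keyword-search core of such an MVP would involve, here is a minimal sketch of a full-text index spanning several sources. The KeywordIndex class, the source labels, and the sample documents are all hypothetical; real integrations would fetch documents through each service’s own API, which is not shown here.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

class KeywordIndex:
    """Toy inverted index: maps each token to the ids of documents containing it."""

    def __init__(self):
        self.postings = defaultdict(set)  # token -> set of document ids
        self.docs = {}                    # document id -> (source, text)

    def add(self, source, text):
        """Store a document from the given source and index its tokens."""
        doc_id = len(self.docs)
        self.docs[doc_id] = (source, text)
        for token in tokenize(text):
            self.postings[token].add(doc_id)
        return doc_id

    def search(self, query):
        """Return (source, text) for every document containing all query tokens."""
        tokens = tokenize(query)
        if not tokens:
            return []
        # .get avoids creating empty postings for tokens that were never indexed
        matches = set.intersection(*(self.postings.get(t, set()) for t in tokens))
        return [self.docs[i] for i in sorted(matches)]

# Hypothetical sample data standing in for three integrations.
index = KeywordIndex()
index.add("local", "Notes on finding a good co-founder")
index.add("instapaper", "An operating system that separates data from user interfaces")
index.add("twitter", "Thread about startup ideas and co-founder disputes")

results = index.search("co-founder")
missed = index.search("finding a business partner")
```

The query “co-founder” finds the local note and the tweet, but “finding a business partner” finds nothing, because the note never uses the word “partner.” That gap is exactly what keyword search cannot close on its own.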
Let’s optimistically assume Boundless is able to reach product-market fit after iterating on this MVP. Even in this best case, the product’s potential will be stuck at a local maximum. In order to create a medium for thought and creativity, not just a search tool, it becomes necessary to do novel NLP research. And at this point, research will require a team of talented engineers and a lot of money.
Now if we’re realistic, the most likely outcome is mild growth or no growth at all. If this happens, Boundless will be at a dead end unless I undertake NLP research.
The reason YC and the Lean Startup advocate launching ASAP is that you can learn a lot from your initial users. If no one is using your product, maybe you’re not solving a pressing problem. And if you find some power users, you can study their usage to keep product development from veering off course. But if NLP research will be inevitable no matter what I learn from users, the better path forward is to start with research.
There are a host of benefits that come from taking this approach. Most importantly, it will give Boundless a substantial unfair advantage. Not only will this new technology make search an order of magnitude better, it will also be difficult for competitors to replicate. I’ll be much more likely to reach product-market fit, and even if I don’t, I’m still in a strong position.
If it somehow turns out that no one wants a cross-app search engine, research would open the door to an entirely new category of products I could still create. What if there were an OS-level command bar that let you execute actions across any app with natural language? What about an intelligent search API, powered by the Boundless algorithm, that developers integrate into their wikis, documentation sites, and knowledge bases? Or what if there were a note-taking app that completely eliminated the need for folders and tagging and automatically linked related notes for you? There are a lot of possibilities here.
In the final analysis, everything hinges on whether the research I’m proposing is possible and whether I’m capable of making substantial progress in the span of years, not decades. But unless I determine it’s totally infeasible, research seems like the best path forward.
Please let me know if there are any important considerations that I’m overlooking.
Personal website: ronith.co
Boundless website: boundless.so