announcing clew: my search engine

I’ve been posting about my exploits in starting to code a search engine here on this blog, and I’m happy to announce that I’ve finally picked a name and set up a placeholder site.

clew!

The name I chose is “Clew”, and I put a good deal of thought into that choice. A clew is a ball of thread or yarn; specifically, I envisioned the clew from the Greek myth of Theseus and the Minotaur, which leads Theseus safely through the labyrinth. This is an excellent metaphor for the task of a search engine: its job is to guide you through the labyrinthine World Wide Web safely to your destination.

I also incorporated other names from the Theseus myth into our naming; the web crawler, for example, is named after Ariadne, who gave Theseus the clew.

You can view the site at clew.se.

faq

As I anticipated, I’ve received a number of questions about the crawler.

what programming language will you use?

I’m using Python for the crawler, though I initially wrote it using bash scripts.

I haven’t decided on the language for the search backend yet; I’ll burn that bridge when I get there.

why the .se tld for the domain?

.se is intended, in an ideal world, for Swedish websites. I’m not Swedish and I’ve never been to Sweden.

That said, when I settled on “Clew” as the name, I was faced with an issue: domain name availability is always tricky. In the case of “clew”, the domains under a number of the more conventional TLDs have been taken already, usually by squatters hoping to sell them to people like me at a very high price (which I can’t afford).

Faced with a number of imperfect options, I chose .se, rationalising the “se” as standing for “search engine”.

I hope that’s not a dealbreaker. I expect it won’t be.

why doesn’t the search bar do anything yet?

As I’ve mentioned, I’m not anywhere near ready to publish any results yet. The current site is a placeholder and showcases my current mockup of the design as the landing page.

will this be a privacy-focused search engine?

Privacy isn’t the primary focus of Clew, but all of my projects try to respect people’s privacy as much as possible.

I do have some privacy features in mind that I’ll likely implement. If you have ideas about features you’d like to see in a search engine (privacy-focused or otherwise), feel free to contact me!

do you need any help with designing a logo or anything like that?

I’ve had multiple people offer already; I’m not quite at that stage yet, though. Once I am, I’ll probably contact them directly.

will this be self-hostable?

I am planning to open source the code once it’s more mature, and I would like very much for it to be easily self-hostable. I’d like to be able to provide database dumps of our index to help people easily seed their own instances with a bunch of data without having to run crawlers for weeks to get good data, but I’m also carefully thinking through the implications of that: I don’t want to just provide an easy dataset for the training of ML models like ChatGPT. That would be very disrespectful to the sites I’m crawling.

I guess the answer for now is, “I don’t know what form this will be in; I have to balance making things easier for people wanting to take control of their data with respecting the sites I’m crawling.” I don’t like that, but it’s the world we live in.

when will this launch?

I have zero idea. :D

do you need financial contributions?

(yes, I have actually been asked this, it’s not just self-promo)

Not much at this stage, but once I start renting server space, costs may pile up. But hey, I’m also spending a bunch of time on development, and you may just want to shoot me some money in appreciation of that.

In any case, I have made a Liberapay team for Clew. You’re welcome to donate there, should you feel so led.

other questions

If you have other questions, feel free to contact me!

thank you!

Overall, I just want to thank everyone who’s been following along with this journey for their support! What started as a random statement of “I want to build a search engine” on Polymaths Alpha has grown into a passion project with dozens following along and cheering me on from the sidelines.

It’s been a wild journey, and I hope something wonderful comes out of it. And, of course, if worst comes to worst, I’ll have learned a lot and had fun, and that’s what’s most important to me with this project.



If you like the work I do, please consider supporting me on Liberapay!
