Knit picks: Feb 19

2021/02/19

Hello friends of Knit! This is the first email update, hurrah. Here’s what I’m hoping for:

Accountability. Duh.

Structure. Covid days and covid nights, amiright? I hope checking in helps focus each week.

Articulation. I want to exercise putting thought to paper and see if it helps pitching, etc.

Feedback. Putting things out there. Inviting any nit picks! ;)

Code stuff

My immediate goal is to host my movie trailer pipeline on Knit. Because don’t you write a data-driven make file web scraping pipeline when you’re deciding what movie to watch with your family? It has some properties that make it a nice mini stress test. It is end-to-end, with data scraping and integration (IMDb and YouTube), some rudimentary analysis (scoring trailers), and presentation. A sub-pipeline is instantiated for each movie trailer, so it exercises pipeline composition. The video files are kind of large. It runs on a server, so Knit needs to be reasonably convenient to distribute. I’ll be spending a fair amount of time refactoring pipeline composition.
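To make "a sub-pipeline per trailer" concrete, here is a minimal Python sketch of that kind of composition. It is not Knit's actual API: `Step`, `Pipeline`, `compose`, `trailer_pipeline`, and `movie_night` are all hypothetical names, and the scraping and scoring steps are stand-ins that return fake values.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Step:
    name: str
    inputs: List[str]
    run: Callable[..., object]


@dataclass
class Pipeline:
    steps: List[Step] = field(default_factory=list)

    def add(self, name, inputs, run):
        self.steps.append(Step(name, list(inputs), run))
        return self

    def compose(self, prefix, sub):
        """Embed a sub-pipeline, namespacing its step names under prefix."""
        for s in sub.steps:
            inputs = [f"{prefix}/{i}" for i in s.inputs]
            self.steps.append(Step(f"{prefix}/{s.name}", inputs, s.run))
        return self

    def execute(self) -> Dict[str, object]:
        results: Dict[str, object] = {}
        # Assumes steps were added in dependency order (no scheduling here).
        for s in self.steps:
            results[s.name] = s.run(*(results[i] for i in s.inputs))
        return results


def trailer_pipeline(title):
    """Hypothetical per-trailer sub-pipeline; every step is a stand-in."""
    p = Pipeline()
    # Stand-in for scraping metadata: the fake rating is the title length.
    p.add("metadata", [], lambda: {"title": title, "rating": float(len(title))})
    # Stand-in for downloading the trailer video.
    p.add("trailer", [], lambda: f"{title}.mp4")
    # Stand-in for scoring the trailer.
    p.add("score", ["metadata", "trailer"], lambda meta, video: meta["rating"] * 10)
    return p


def movie_night(titles):
    """Top-level pipeline: one sub-pipeline per title, then a ranking step."""
    top = Pipeline()
    for t in titles:
        top.compose(t, trailer_pipeline(t))
    top.add(
        "ranking",
        [f"{t}/score" for t in titles],
        lambda *scores: sorted(zip(scores, titles), reverse=True),
    )
    return top


results = movie_night(["Alien", "Brazil"]).execute()
```

The namespacing in `compose` is the design choice worth noticing: each trailer gets its own copy of the same steps, and the top-level ranking step refers to them by prefixed name.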

A medium-term goal is to split Knit up into a “frontend” and “backend.” The backend is the core data structures for representing data pipelines, plus how they are executed. The frontend is how end users express their data pipelines. There are actually a few frontends implemented right now: two custom file-based text formats (unit and template files), partial support for make files, and two prototypes of IPython-style “notebooks.” The goal is to support multiple frontends and allow pipelines to be composed across them (part of a pipeline can be written as a notebook, part as a make file, etc.). The hypothesis is that not everyone thinks about data pipelines in the same way, so we want to support whatever mode makes the most sense in any given situation.
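One way to picture the split is two frontends that parse different text formats into the same backend structure, which then decides execution order. This is only a sketch under assumptions: the `Step` record, both toy formats, and `execution_order` are invented for illustration and are not Knit's real formats or data structures.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Step:
    """Backend representation shared by every frontend (hypothetical)."""
    target: str
    deps: Tuple[str, ...]
    command: str


class MakeLikeFrontend:
    """Toy format: one 'target: dep1 dep2 | command' rule per line."""

    def parse(self, text: str) -> List[Step]:
        steps = []
        for line in text.strip().splitlines():
            head, command = line.split("|", 1)
            target, deps = head.split(":", 1)
            steps.append(Step(target.strip(), tuple(deps.split()), command.strip()))
        return steps


class CellFrontend:
    """Toy notebook-like format: cells separated by blank lines,
    each starting with a '#% target <- dep1 dep2' header line."""

    def parse(self, text: str) -> List[Step]:
        steps = []
        for cell in text.strip().split("\n\n"):
            header, _, body = cell.partition("\n")
            target, _, deps = header.lstrip("#% ").partition("<-")
            steps.append(Step(target.strip(), tuple(deps.split()), body.strip()))
        return steps


def execution_order(steps: List[Step]) -> List[str]:
    """Backend: depth-first ordering so dependencies come first.
    Names without a rule are treated as external inputs.
    No cycle detection in this sketch."""
    by_target = {s.target: s for s in steps}
    order, seen = [], set()

    def visit(target):
        if target in seen or target not in by_target:
            return
        seen.add(target)
        for dep in by_target[target].deps:
            visit(dep)
        order.append(target)

    for s in steps:
        visit(s.target)
    return order


make_text = (
    "scores.csv: trailers.json | python score.py\n"
    "report.html: scores.csv | python report.py"
)
cell_text = "#% trailers.json <-\npython scrape.py"

# Steps from two different frontends are merged into one backend pipeline.
steps = MakeLikeFrontend().parse(make_text) + CellFrontend().parse(cell_text)
order = execution_order(steps)
```

Because both frontends emit the same `Step` records, the backend never needs to know which format a given part of the pipeline was written in.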

The milestone event will be some sort of release announcement, on Hacker News or Reddit or wherever. It’s mostly TBD still; I’m working on recruiting some early users first.

Soft stuff

I started actively reaching out a couple of weeks ago. The personas I’ve identified are cofounder, early users, industry “insiders,” and founder “advocates.” Insiders are people working in and following the data space whom I can talk shop and trade ideas with. Advocates are generally other founders to commiserate with over the founder journey.

I’m trying Notion to organize contacts and issue tracking. I’m not sure I like it for issue tracking but I haven’t found another system I liked. For now it works and I like being able to tweak it for my workflow. Tangentially, I really wanted to like Fibery. It fits how my brain works. They also have a refreshingly candid blog. It’s just a little too clunky.

The snowstorm forced me out of the house and into a coworking space downtown. I’ll probably continue to drop by when I really want to focus. I need to figure out traffic, though. The commute sucks!

Read stuff

I’ve been reading The ZeroMQ Community. Building an open source community is kind of more intimidating to me than building a company. I’m leaning into it though, because my hypothesis is that for Knit’s vision to succeed, it needs to be extremely widely distributed with a broad user community. It is making me consider changing from MIT/BSD to a GPL license:

The license we choose modifies the economics of those who use our work. In the software industry, there are friends, foes, and food. BSD makes most people see us as lunch. Closed source makes most people see us as enemies (do you like paying people for software?) GPL, however, makes most people [. . .] our allies.

Thanks for following along!