Hey Knitting Nancies!
Hard stuff
Modules, amiright? There are lots of details to consider, but really I went over the implementation again and again, cutting it down. Less is more! I think what is there is enough to unblock other interesting use cases, without making a bunch of assumptions that might need to be unwound.
The trailer park now runs on Knit! Now I should add some new movies. I’m excited because this pipeline parameterizes the module that stores the videos. Switching the module takes a few lines, so instead of saving the videos on my laptop I can upload them to an ssh server. (My slogan should be ssh-ake and Make.) There are interesting possibilities, like running the same pipeline on S3 prod data or local test data, or spinning up a new environment in GCP by plugging in a different store.
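To make the "plug in a different store" idea concrete, here is a minimal Python sketch. It is not Knit's module syntax or implementation, only an illustration of a pipeline that takes its store as a parameter. Every name in it (Store, LocalStore, MemoryStore, run_pipeline, the trailers/ key prefix) is made up for the example. An ssh or S3 store would be another class with the same two methods.

```python
from pathlib import Path
from typing import Protocol


class Store(Protocol):
    """Hypothetical interface: anything that can save and load bytes by key."""

    def put(self, key: str, data: bytes) -> None: ...

    def get(self, key: str) -> bytes: ...


class LocalStore:
    """Keeps blobs as files under a root directory (the "laptop" case)."""

    def __init__(self, root):
        self.root = Path(root)

    def put(self, key, data):
        path = self.root / key
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def get(self, key):
        return (self.root / key).read_bytes()


class MemoryStore:
    """Keeps blobs in a dict; handy as a stand-in for test data."""

    def __init__(self):
        self.blobs = {}

    def put(self, key, data):
        self.blobs[key] = data

    def get(self, key):
        return self.blobs[key]


def run_pipeline(store, videos):
    """Save each video through whichever store was plugged in.

    The pipeline only calls put(), so it never needs to know
    where the bytes actually end up.
    """
    for name, data in videos.items():
        store.put(f"trailers/{name}", data)
    return sorted(videos)
```

Swapping environments is then a one-line change at the call site: run_pipeline(LocalStore("videos"), ...) versus run_pipeline(MemoryStore(), ...).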
I want to start thinking about runtime dependencies like virtualenv. I may also play around with a Postgres store or practical partitioning.
Scientific stuff
I went down a rabbit hole. It was unintentional but interesting.
More developer survey results, for Julia users this time! Interesting to see how scientific computing skews.
- 60% of users are academics. 3% (!) identify as women.
- bash/shell is the third most used language, with 22% saying they use it a great deal. SQL is 12%.
- C/C++ are the third most liked languages, behind Python, and ahead of R, MATLAB, and bash. SQL is half as liked as C.
- The most popular feature of Julia is performance (recall that was the most desired Python feature).
- Half of folks only run locally. There is no dominant hosting provider, but the plurality run on their own clusters. (A third of users complained about Julia’s inability to compile to a self-contained binary.)
- OS popularity, descending: Linux, Windows, Mac.
Singularity is the Docker for scientific computing. Snakemake is the Airflow of scientific computing. Getting your paper cited is the marketing of academia.
Pluto (this page feels broken) is the up-and-coming notebook for the Julia world; it is inspired by the JavaScript notebook Observable. The key innovation is to use a DAG to model dependencies between cells. So when any cell is updated, every cell that depends on it updates reactively, with minimal recomputation. Besides avoiding issues of stale notebook state, this also allows interactivity with Excel-level simplicity.
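Here is a rough Python sketch of the DAG idea, to show why only part of the notebook needs to rerun. It is not how Pluto is implemented. The Notebook class and its methods are invented for illustration, and it assumes cells declare their dependencies explicitly and are defined after the cells they depend on. (Pluto works the dependencies out from the code itself.)

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class Notebook:
    """Toy reactive notebook: each cell is a function of other cells."""

    def __init__(self):
        self.cells = {}   # cell name -> (function, list of dependency names)
        self.values = {}  # cell name -> last computed value
        self.runs = []    # log of executed cells, to show what was recomputed

    def define(self, name, deps, fn):
        """Create or update a cell, then refresh everything downstream."""
        self.cells[name] = (fn, list(deps))
        self._recompute(name)

    def _downstream(self, changed):
        """Collect the changed cell plus every cell that depends on it."""
        dirty = {changed}
        grew = True
        while grew:
            grew = False
            for name, (_, deps) in self.cells.items():
                if name not in dirty and dirty.intersection(deps):
                    dirty.add(name)
                    grew = True
        return dirty

    def _recompute(self, changed):
        dirty = self._downstream(changed)
        graph = {name: deps for name, (_, deps) in self.cells.items()}
        # static_order() yields dependencies before dependents and
        # raises graphlib.CycleError if the cells form a cycle.
        for name in TopologicalSorter(graph).static_order():
            if name in dirty:
                fn, deps = self.cells[name]
                self.values[name] = fn(*(self.values[d] for d in deps))
                self.runs.append(name)


nb = Notebook()
nb.define("x", [], lambda: 2)
nb.define("y", [], lambda: 10)
nb.define("double", ["x"], lambda x: x * 2)
nb.define("total", ["double", "y"], lambda d, y: d + y)

nb.runs.clear()
nb.define("x", [], lambda: 5)  # edit one cell
print(nb.runs)                 # cells rerun by the edit; "y" is not among them
print(nb.values["total"])
```

Editing x reruns x, double, and total in dependency order, and leaves y alone because nothing upstream of it changed. That is the "minimal recomputation" part.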
Julia launched new package and artifact systems last year. Because modules and packaging are hard.
ssh-ake and Make!