Three interesting things - 20231213

Dec 13, 2023

Interactive visualization of LLM architectures

https://bbycroft.net/llm by Brendan Bycroft

This is a really cool visualization! It does not only allow you to explore the architecture with zoom & rotation, but also has a detailed explanation of how information flow through the network, with a high-quality animation. Although it still requires some time to understand what’s going on in the visualization, it is extremely well done.

You can also check out source code: https://github.com/bbycroft/llm-viz

Pre-registration for predictive modeling

An interesting proposal to pre-registering predictive modeling (research) by @jakehofman et al.

https://arxiv.org/abs/2311.18807

Why do you need this? Although predictive (vs. explanatory) modeling is less prone to p-hacking due to the requirement to test on out-of-sample data, there are still many factors and choices that can make it vulnerable. Many predictive modeling tasks not only involve a lot of potential predictors but also require exploratory data analysis (EDA) and iterative model updates to arrive at good models. This can sometimes lead to overfitting and inappropriate re-use of data. Furthermore, there are still lots of researcher-degree-of-freedom like the choice of evaluation metrics, hyperparameters, etc.

The key idea from this paper is to do a two-step pre-registration: the first at the time of declaring the problem and the second after finishing the training of the model (with the model details and evaluation criteria). The first step would help clarifying and setting the problem concretely and the second step can ensure good generalizability.

llamafile

llamafile is a project by Mozilla Ocho (“Innovation and Experiments @ Mozilla”) that aims to distribute LLMs (and beyond?) as a highly-optimized single executable. First heard from Simon Willison but Kevin’s post finally nudged me to try it out.

Kevin’s digital whiteboard

Running LLMs on your desktop with one click

Simon Willison mentioned this a few days ago. I didn’t take it seriously—until today. A project called llamafile makes it insanely easy to run LLMs on your local machine. You just need to download a file and then run it. That’s it! They provided a few different models. I tried the LLaVA 1.5 model, a multimodal model on my M2 Mac mini with 8G RAM, and the…

2 years ago · Kevin Yang

I tried on my MacBook Pro and the ease and speed is impressive! You can run 7B parameter models no problem.

YY’s Newsletter

Discussion about this post