This page contains thoughts more akin to a collection of interesting links than formal reviews. Books are now on a separate page.
(This is the kind of thing they ask in interviews!)
My brother-in-law told me recently that interviewers are asking "transformers" questions, which boggles the mind a little, but was apparently confirmed by Sasha. One more thing for the poor CS undergrads to learn now, on top of dynamic programming and Dijkstra's. I loved this post for explaining a fairly complex topic with excellent diagrams.
This is a really, really well-written piece of long-form journalism. Since it's only early January, I'll call it my favorite piece of writing this year without too much thought. If anyone knows of other writers like him, I'd love to read more in this style.
In general Ben's blog posts are a huge source of insight for me, and this is another great one. One surprising part of growing up is realizing how easy it is to become pretty good at something and that a lot of "efficient markets" aren't actually efficient. I also think that there's a lot of thinking that looks like "smart people have probably thought about this already", but in reality smart people are apparently pretty busy optimizing ads and making the market for single-name equity options liquid.
Eric has a knack for communicating research results clearly (often with a lot of humor) and this is a great piece on how the world of machine learning has subtly shifted in the past few years. One piece of "folk knowledge" from the mid-2010s is that surrogate losses don't work particularly well, and that you should directly optimize (with gradients) your metric of interest. Eric roughly makes the point that some metrics are hard to optimize for (e.g. "optimal policy") but there are sometimes more general (but easier to fit) objectives that can be used like a lookup table to find the specific solution you're looking for. Instead of trying to predict which team will win a specific NBA game, optimize a function that computes the expected points per game of any set of players, and then run that function on both teams. The latter model can be trained on a much broader set of data points, and you get the specific result "for free".
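To make the NBA example concrete, here is a toy sketch of my own (not something from Eric's post). Everything in it is made up for illustration: the player names, the `TRUE_POINTS` numbers, and the assumption that a lineup's points are simply the sum of per-player contributions. It fits the general "expected points for any lineup" function on every lineup except the two we care about, then reads off the specific matchup.

```python
from itertools import combinations

# Hypothetical per-player contributions (points per game). Toy numbers only.
TRUE_POINTS = {"A": 30.0, "B": 25.0, "C": 20.0, "D": 28.0, "E": 22.0, "F": 18.0}

# The specific matchup we care about; neither lineup appears in training.
team_1 = ("A", "B", "C")
team_2 = ("D", "E", "F")


def observed_points(lineup):
    """Simulated box score: assumes contributions are purely additive."""
    return sum(TRUE_POINTS[p] for p in lineup)


# The "broader" dataset: every other 3-player lineup and the points it scored.
training = [
    (lineup, observed_points(lineup))
    for lineup in combinations(sorted(TRUE_POINTS), 3)
    if lineup not in (team_1, team_2)
]


def fit(data, lr=0.1, steps=2000):
    """Fit one weight per player by batch gradient descent on squared error."""
    weights = {p: 0.0 for p in TRUE_POINTS}
    for _ in range(steps):
        grads = {p: 0.0 for p in weights}
        for lineup, points in data:
            err = sum(weights[p] for p in lineup) - points
            for p in lineup:
                grads[p] += 2 * err / len(data)
        for p in weights:
            weights[p] -= lr * grads[p]
    return weights


def expected_points(weights, lineup):
    """The general objective: expected points for any set of players."""
    return sum(weights[p] for p in lineup)


weights = fit(training)
pred_1 = expected_points(weights, team_1)
pred_2 = expected_points(weights, team_2)
predicted_winner = "team_1" if pred_1 > pred_2 else "team_2"
print(round(pred_1, 2), round(pred_2, 2), predicted_winner)
```

The model never sees either team play, let alone play each other, yet the matchup prediction falls out of two calls to `expected_points`. Real basketball is obviously not additive like this; the point is only the shape of the approach.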