Posts

Paper Explainer: Geometry and Stability of Supervised Learning Problems

Just released a new paper! In it, my coauthors and I try to make sense of some challenges in machine learning by creating a "space of all problems". If you don't know what that means, that's okay! This post explains the big ideas for non-mathematicians. What is Supervised Learning? Suppose you've got some data on the IQ and SAT scores of a bunch of people, and the data looks like this: (Note: I made this data up. Don't believe it.) Using this data, can you use someone's IQ score to get a rough estimate for their SAT score? Sure, you could fit a trendline to the data using some good ol' linear regression. It'll look something like this: Now if you know someone's IQ (say, 110), you can predict what their SAT score might be using the trendline (in this case, about 1207). Congratulations! You just took part in supervised learning! You used an algorithm to... take data about the relationship between two variables $x$ and $y$, and use t
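The trendline idea in this excerpt can be sketched in a few lines of Python. This is only an illustration of ordinary least squares: the `iq` and `sat` lists below are hypothetical numbers invented for this sketch (not the post's data, which is itself made up), and the function names `fit_line` and `predict` are assumptions, not anything from the paper.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Read a prediction off the fitted trendline."""
    return slope * x + intercept


# Hypothetical (IQ, SAT) pairs, invented purely for illustration.
iq = [85, 95, 100, 110, 120, 130]
sat = [900, 1010, 1050, 1180, 1290, 1400]

slope, intercept = fit_line(iq, sat)
print(round(predict(slope, intercept, 110)))
```

The "learning" step is `fit_line`, which turns example pairs into a rule; the "prediction" step is `predict`, which applies that rule to an input it has not seen.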

What Does an OSI Layer Even Mean?

I had a tough time wrapping my head around the OSI network model. In my experience, the layers are always presented just as layers, without any explanation of what a layer actually represents or what it means for one layer to sit on top of another, or even what is being "modeled" by this model. Instead, people just say "Layer 4 includes things like TCP and UDP, it worries about letting processes talk, it introduces ports, yada yada yada." Eventually I figured it out, and in this post I want to lay out what I think is a useful perspective: a layer is a level of abstraction that solves a specific problem. (Note: I'll be talking about the simplified five-layer version of the model, as opposed to the full seven-layer OSI model. While the specific layers are different, the ideas are the same.) What is a Layer? Consider two computers communicating through a network. Maybe they're sending text messages, or video chatting, or requesting and producing webpages. There is a deep stack

The Logarithmic Black Sheep

Behold, the integral power rule: For any real $p$, $$\int x^p\, dx = \frac{1}{p+1} x^{p+1} +C$$ Dependable. Ubiquitous. Cursed with an asterisk: ... unless $p=-1$, in which case $$\int x^p\, dx = \ln|x|+C.$$ I have gotten used to this fact through years of working with integrals. All the same, it bothers me. Every time I am forced to check if my exponent falls into this one exceptional case, a voice somewhere in my head protests. "How does this make sense?? How do you get a continuous family of monomials except for the one case where, of all things, you get a logarithm!?" This is not to say I don't understand why $\int dx/x$ spits out a logarithm. I can see that the usual formula would force you to divide by 0 at $p=-1$, and I could even prove the antiderivative in a few lines. It's just that it feels like an unexplained discontinuity at a point. I never learned how this logarithmic black sheep fits in with the big happy family of monomials. Fitting the
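One standard way to see the logarithm as a member of the family, offered here as a sketch and not necessarily the route this post takes: assume $x>0$ and fix the constant of integration by measuring from $1$, so that every antiderivative vanishes at $x=1$. For $p\neq -1$,

$$F_p(x) = \int_1^x t^p\, dt = \frac{x^{p+1}-1}{p+1}.$$

Writing $h=p+1$ and $x^{h}=e^{h\ln x}$, the exceptional case appears as a limit:

$$\lim_{p\to -1} F_p(x) = \lim_{h\to 0}\frac{e^{h\ln x}-1}{h} = \ln x = \int_1^x \frac{dt}{t}.$$

With this normalization the family $F_p$ varies continuously in $p$, and the logarithm is exactly the value it takes at $p=-1$.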

The Two Guards Riddle: A Couple of Insights

Remember that riddle with the two doors and the two guards, where one always lies and one always tells the truth? I was reminded of it when it came up in a podcast recently. After some thought, I had a couple of insights. Just to recap, the riddle goes like this. You are trapped in a room with two exits. One leads out to freedom, while the other leads to certain death. You can't tell the difference without taking the risk and stepping through a door. There are two guards in the room with you. These guards will happily let you step through a door. They have a peculiar quirk: one guard always lies, while the other always tells the truth, but again you don't know which is which. They know which door leads out and which leads to death. You can ask ONE of the guards ONE yes-or-no question. Your goal, of course, is to use that one question to determine which door you should pass through and avoid certain death. An example question that would not work: "Does the left door lead

"Why Do I Need Complex Numbers?"

I have heard students ask this question over and over in one form or another. A recent incarnation was featured in this video by 3Blue1Brown. I think this version strikes at the heart of the vendetta that many students have against complex numbers. "What is an application that is impossible to achieve without complex numbers? Convince me that they are needed. They are fun to work with, and maybe they make things easier, but we can do without them." (Edited for grammar and spelling.) At the heart of this question is a reasonable request. When introduced to a new mathematical concept, we should demand reasons, even if the answer is necessarily a bit delayed. High school students in the US are often introduced to complex numbers with little motivation. To them, this new number system can seem like a lot of complexity with little payoff. Maybe a few applications are presented, but one might argue they could be achieved without complex numbers, even if it requires sacr

Fight Functors with Fire

Once during undergrad, my topology professor was trying to prepare the students for an exam. He was an interesting guy. His look and disposition were unusually relaxed, and his clothes looked more like those of a teenager on summer vacation than those of a professor. Typical attire included shorts, flip flops, and a t-shirt referencing a cartoon from the 90's. Less than 30 years old, he was by far my youngest instructor. He often told stories of his own mathematical struggles in grad school, which always seemed to end with his adviser yelling at him. No one could fault his enthusiasm; his excitement about topology was too pure for this world. I always enjoyed his lectures. Needless to say, the students loved him. To prepare for the upcoming exam, he'd collected some practice problems. Before diving in, he devoted a few precious minutes of lecture time to reading us a quote that he thought perfectly crystallized the way we should learn mathematics. "Don't just read it;