The Foundations of Insight: Basis Functions

Blog Post / SI: Clarifying Basis Functions & The Physics of Line Fitting


Context: This is a Supplementary Information (SI) post intended to clarify concepts from previous lectures regarding basis functions, line fitting, and the physical interpretation of regression.



1. The "Transform" View of a Line

In previous discussions, we treated the equation of a line (y=mx+b) as pure geometry. It is more useful to view it as a transform.

When we define a line, we are defining a rule for how to get to a point on the y-axis (upward direction) based on a position on the x-axis.

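  • Rise (h) & Run (r): Instead of an abstract slope, think of the ratio m = h/r as the rate of change dy/dx: how much y changes for each unit step in x.

  • The Intercept (b): The anchor point, i.e. the value of y at x=0.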

We aren't just drawing a shape; we are measuring quantities. If you use a meter stick and measure points at 0.5m, 0.75m, etc., those measurements are distributed along the axis. The "line" is our attempt to model the relationship between the measured positions and the values recorded at them.
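To make the transform view concrete, here is a minimal sketch. The meter-stick readings are made-up values chosen to lie exactly on a line, and the helper name line_transform is hypothetical, not something from the lecture.

```python
import numpy as np

# Hypothetical meter-stick readings: positions (in metres) and a measured value.
# These numbers are invented for illustration and lie exactly on a line.
x = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
y = np.array([1.0, 1.5, 2.0, 2.5, 3.0])

def line_transform(x, m, b):
    """Map a position on the x-axis to a height on the y-axis."""
    return m * x + b

rise = y[-1] - y[0]   # h: total change in y
run = x[-1] - x[0]    # r: total change in x
m = rise / run        # slope = h / r, the rate of change dy/dx
b = y[0] - m * x[0]   # intercept: the value of y at x = 0

predicted = line_transform(x, m, b)
print("slope:", m, "intercept:", b)
```

With real, noisy measurements the first and last points alone would not pin down the line, which is where fitting comes in.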

2. Basis Functions as "Physical Probes"

The confusion often lies in what a basis function actually does. It is best to think of a basis function (like a Gaussian or a specific polynomial term) not as a mathematical abstraction, but as a spotlight or a filter.

  • The Gaussian Example: If you have a function like e^(-(x-5)^2), it effectively "kills" the signal everywhere except near x=5.

  • The Multiplication: When you multiply your data line by this basis function, you are mathematically saying, "I only care about what is happening in this specific region."

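If you sum enough of these local "spotlights," each scaled by its own weight, you can approximate any reasonably smooth curve over a finite range. This is why the width of your basis function (its standard deviation, the square root of the variance) matters: it sets the "resolution" of your probe.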
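The spotlight idea can be sketched in a few lines. Everything specific here is an assumption made for illustration: the test curve, the 15 evenly spaced centers, the width of 0.8, and the choice of least squares to pick the weights.

```python
import numpy as np

def gaussian(x, center, width):
    """Gaussian bump centred at `center`; `width` is the standard deviation."""
    return np.exp(-((x - center) ** 2) / (2.0 * width ** 2))

x = np.linspace(0.0, 10.0, 201)
signal = np.sin(x) + 0.1 * x   # made-up curve standing in for the data

# Spotlight: width = sqrt(0.5) makes the bump exactly e^(-(x-5)^2).
probe = gaussian(x, center=5.0, width=np.sqrt(0.5))
windowed = signal * probe      # only the region near x = 5 survives

# Reconstruction: a weighted sum of bumps (assumed centers and width).
centers = np.linspace(0.0, 10.0, 15)
width = 0.8
Phi = np.column_stack([gaussian(x, c, width) for c in centers])

# Choose the weights by least squares, then rebuild the curve.
weights, *_ = np.linalg.lstsq(Phi, signal, rcond=None)
reconstruction = Phi @ weights
max_error = np.max(np.abs(reconstruction - signal))
print("largest reconstruction error:", max_error)
```

Narrowing the width without adding more centers leaves gaps between the spotlights, while widening it blurs fine detail; that trade-off is the "resolution" described above.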

3. The Physics of Fitting: "Torque" on a Line

Why do we fit a line instead of just connecting the dots?

  • Connecting the Dots: This is Overfitting. The model passes through every point, noise included, so the path it traces between points is arbitrary and it predicts poorly.

  • Line Fitting: This is Equilibrium.

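Think of the line as a rigid rod pivoting about the Center of Mass of the data, the point (mean of x, mean of y). Every data point you measured pulls vertically on that rod with a spring-like "force" proportional to its residual, the vertical distance between the point and the line.

  • Residual = Force, Horizontal Offset = Lever Arm: The further a point sits above or below the line, the harder it pulls; the further it sits horizontally from the pivot, the more "torque" that pull applies.

  • The Best Fit Line: This is the state of Mechanical Equilibrium. Minimizing the sum of the squared residuals (the energy stored in the springs) gives exactly the line on which the forces cancel and the torques cancel.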

When we view regression as a system of forces rather than just "arbitrary numbers," the math becomes physical. We are finding the stable resting position for the line among the data.
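As a sanity check on the equilibrium picture, the sketch below assumes each point pulls vertically on the line with a force equal to its residual, and takes the lever arm to be the horizontal offset from the mean of x. The data, the random seed, and names such as net_force and net_torque are invented for illustration; np.polyfit stands in for whichever fitting routine you prefer.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 20)
# Made-up noisy measurements scattered around y = 2x + 1.
y = 2.0 * x + 1.0 + rng.normal(scale=0.1, size=x.size)

# Ordinary least-squares line.
m, b = np.polyfit(x, y, deg=1)
residuals = y - (m * x + b)      # assumed vertical "force" from each point

x_bar, y_bar = x.mean(), y.mean()
net_force = residuals.sum()                     # pulls up and down cancel
net_torque = ((x - x_bar) * residuals).sum()    # twisting about the pivot cancels
passes_through_centroid = bool(np.isclose(m * x_bar + b, y_bar))

def sum_of_squares(m_try, b_try):
    """Total squared residual for a candidate line."""
    return np.sum((y - (m_try * x + b_try)) ** 2)

print("net force:", net_force)
print("net torque:", net_torque)
print("line passes through the center of mass:", passes_through_centroid)
```

Both totals come out at zero up to floating-point rounding, and nudging the slope or intercept away from the fitted values raises sum_of_squares.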

4. From Parabola to Polynomials

When the data isn't linear (e.g., a curved distribution), we don't throw away the physics; we just add more "gears."

  • Adding an x^2 term allows the line to bend once (Parabola).

  • Adding x^3, x^4, … allows additional bends: each higher power permits one more turning point and one more inflection point.

However, as demonstrated in the video, trying to fit these by hand (adjusting h and r manually) is difficult and prone to error. This is exactly why we use algorithms such as Gradient Descent: they automate the "balancing of the torques" that we struggle to do by eye.
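A minimal sketch of that automation, assuming a quadratic model and plain batch gradient descent on the mean squared error. The data, the learning rate of 0.1, and the 5000 steps are illustrative choices, not values taken from the video.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-1.0, 1.0, 50)
# Made-up curved data scattered around y = 0.5 - x + 2x^2.
y = 0.5 - 1.0 * x + 2.0 * x**2 + rng.normal(scale=0.05, size=x.size)

# Polynomial basis functions as columns: 1, x, x^2.
Phi = np.column_stack([np.ones_like(x), x, x**2])

weights = np.zeros(3)    # start with a flat line at zero
learning_rate = 0.1      # assumed step size, small enough to be stable here
for step in range(5000):
    residuals = Phi @ weights - y
    # Gradient of the mean squared error with respect to the weights.
    gradient = 2.0 * Phi.T @ residuals / x.size
    weights -= learning_rate * gradient

# Direct least-squares solution, used only as a cross-check.
reference, *_ = np.linalg.lstsq(Phi, y, rcond=None)
print("gradient descent:", weights)
print("least squares:   ", reference)
```

For a model this small the direct least-squares solve is the simpler tool; gradient descent is shown because it is the method named above and it carries over to models with no closed-form solution.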



Summary: Basis functions are the building blocks. The Line is the structure. The Data provides the force. When you combine them, you aren't just doing math; you are finding the physical equilibrium of the system.



AI Collaboration Note: This video, its title card, description, and the concepts explored within were developed in a deep, recurrent collaboration with Google Gemini. Our process involves Gemini acting as a Socratic partner, a technical reviewer, and a creative collaborator, helping to refine, structure, and articulate the final concepts and this description. 

