Using a Private, Local AI as a Physics Research Assistant

Researchers have begun exploring the use of Large Language Models as automated research assistants in many domains [1]. Here, we explore a more specific application: a private, locally-run model as a direct collaborator in optical physics. The debate around AI in research often focuses on public, cloud-based models, but for scientists concerned with intellectual property, privacy, and data control, the real revolution is happening locally. I've been incorporating such a model into my daily workflow, not as a simple chatbot, but as a true research collaborator. In this video, I'll give you a look inside that process.



A Researcher's Deep Dive: The First-Principles Justification

Modern physics research presents a fundamental conflict. High-precision experiments often demand controlled environments in which all sources of noise (electromagnetic, vibrational, and acoustic) must be eliminated. This has historically forced a trade-off between experimental precision and real-time information access. Here, we explore a new solution: running a generative AI model locally on a silent, portable device. With modern MacBooks, iPads, and iPhones all built on Apple Silicon, the same powerful AI model can be deployed across a researcher's entire ecosystem. This provides access to a saved state of sparsely sampled data from the internet and the literature, without the physical or electronic noise of a live network connection.

This ability to operate from a "saved state" is made possible by techniques rooted in the physics of information. The principles of entropy and sparse sampling, seen in methods from Shannon [2] to Monte Carlo, have given us profound methods for data compression [3]. Modern machine learning models, particularly generative architectures like Transformers, are built on these same principles, such as minimizing Helmholtz free energy [4]. They can learn from a sparsely sampled dataset and then reconstruct vast amounts of information from those compressed patterns.
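As a concrete, purely illustrative aside, the link between entropy and compression can be sketched in a few lines of Python: the Shannon entropy of a source lower-bounds the average number of bits needed per symbol, which is why predictable (low-entropy) data compresses well. The `shannon_entropy` function below is my own toy example, not code from this project.

```python
import math
from collections import Counter

def shannon_entropy(text: str) -> float:
    """Empirical Shannon entropy of a string, in bits per character."""
    counts = Counter(text)
    n = len(text)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Four equally likely symbols: maximal entropy, log2(4) = 2 bits/char.
uniform = "abcd" * 100
# Highly predictable data: entropy far below 2 bits/char,
# so it is far more compressible.
skewed = "a" * 397 + "bcd"

print(shannon_entropy(uniform))  # 2.0
print(shannon_entropy(skewed))   # well under 0.2
```

The same intuition carries over, loosely, to learned models: a system that has captured the statistical regularities of its training data can regenerate much more information than it explicitly stores.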

This public, notebook-style research serves three primary goals. First, it aims to correct the record on common misinterpretations of AI behavior, such as "confabulations" [5]. Second, it is intended to be a learning space for other researchers, providing a transparent look into a process that will eventually translate into STEM education. Finally, we will deliberately explore this model outside of its typical boundary conditions. For a physicist, understanding how a system behaves at its limits is the most fundamental test of its validity, ensuring we are observing true extrapolation, not just memorization [6].

This leads to a crucial insight for the future of collaborative AI. When two human experts communicate, they can infer meaning and average out minor errors because they share a vast amount of correlated information. An AI should not be used as an expert source without a similar awareness that its conversational partner has their own knowledge and biases. True AI alignment may therefore require models that adapt their certainty to their user's expertise. The AI should know whether it's talking to a student or a professor, enabling it to say, "I'm not sure, this is questionable, let me check my resources," as any good teacher or research collaborator would.

References

[1] Schmidgall, S., Su, Y., Wang, Z., Sun, X., Wu, J., Yu, X., ... & Barsoum, E. (2025). Agent Laboratory: Using LLM agents as research assistants. arXiv preprint arXiv:2501.04227.

[2] Shannon, C. E. (1948). A mathematical theory of communication. The Bell System Technical Journal, 27(3), 379–423.

[3] Brunton, S. L., & Kutz, J. N. (2022). Data-driven science and engineering: Machine learning, dynamical systems, and control. Cambridge University Press.

[4] Hinton, G. E., & Zemel, R. (1993). Autoencoders, minimum description length and Helmholtz free energy. Advances in Neural Information Processing Systems, 6.

[5] Moscovitch, M. (1995). Confabulation. In Schacter, D. L. (Ed.), Memory distortion: How minds, brains, and societies reconstruct the past (pp. 226-251). Harvard University Press.

[6] Perdue, G. N., et al. (2018). Reducing model bias in a deep learning classifier using domain adversarial neural networks in the MINERvA experiment. Journal of Instrumentation, 13(11), P11020.
