I worked with Chen Shani and Dan Jurafsky to trace concepts in language models.
A strategy to obtain a concept from a language model is to apply a linear probe with concept weight \( \mathbf{w}_c \) and bias \( b \) to the model's representation \( \mathbf{h}_s \) of a text \( s \).
\[
\mathbb{P}(s \in C) = \sigma(\mathbf{w}_{c}^{\top} \mathbf{h}_s + b)
\]
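A minimal sketch of evaluating such a probe, in plain Python. The names `probe_probability` and `concept_curve` are hypothetical, and the weights are assumed to be already trained; this only illustrates the formula above, not the project's actual code.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def probe_probability(h, w, b):
    """Return sigmoid(w . h + b) for one representation h (list of floats)."""
    if len(h) != len(w):
        raise ValueError("h and w must have the same dimension")
    return sigmoid(sum(wi * hi for wi, hi in zip(w, h)) + b)


def concept_curve(prefix_states, w, b):
    """Apply the same probe to the representation of every prefix, in order."""
    return [probe_probability(h, w, b) for h in prefix_states]
```

Calling `concept_curve` on the representations of successive prefixes gives one probability per prefix for a single concept; repeating it with a different weight vector per concept gives one curve per concept.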
This defines a curve over the context: each prefix is associated with a mixture of concepts.
We write the average as an incremental update and view it as a discretized differential equation.
This lets us study the behavior of concept space, such as fixed points and bifurcations.
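As an illustration of the incremental form, assume the average is a plain running mean \( m_t \) of per-prefix values \( x_t \) (the project may use a different average). Then \( m_t = m_{t-1} + \frac{1}{t}(x_t - m_{t-1}) \), which has the shape of a forward Euler step with unit step size for \( \dot{m} = (x - m)/t \). A minimal sketch, with the hypothetical name `running_mean`:

```python
def running_mean(values):
    """Running mean computed as an incremental update.

    Each step moves the current mean toward the new value by 1/t,
    which is the discrete update described in the lead-in.
    """
    mean = 0.0
    means = []
    for t, x in enumerate(values, start=1):
        mean += (x - mean) / t  # m_t = m_{t-1} + (x_t - m_{t-1}) / t
        means.append(mean)
    return means
```

A constant input is a fixed point of this update: once the mean equals the input, the increment is zero.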
Data Director
Stanford Daily
2024
The Stanford Daily is the independent, student-run newspaper of Stanford University.
The Daily strives to serve the Stanford community with relevant, unbiased journalism and provides its editorial, tech and business staffs with unparalleled educational opportunities.
As the Data Director, I oversee data section recruitment, training, project management, and associated institutional knowledge.
Technologies:
GitHub
Flourish
D3
Data Analyst
Veterans Affairs
2023
The Department of Veterans Affairs' Program Evaluation and Resource Center (PERC) provides program evaluation and technical assistance for mental health quality improvement efforts across the Veterans Health Administration.
Our objective is to develop an approach for categorizing narrative content about mental health concepts into operationally relevant categories.
Given the burgeoning threat of xylazine-laced fentanyl, we used xylazine as a case example.
Extracted snippets from Text Integrated Utility (TIU) Notes
Developed a custom tokenization procedure with built-in text repair processes
Machine unlearning, the ability for a model to "forget" a subset of its training data, may help scrub bias and safeguard user privacy.
We implement a student-teacher unlearning framework that rewards similarity on a retain set and penalizes similarity on a forget set.
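One way such an objective could look, sketched in plain Python. The use of KL divergence as the similarity measure, the `forget_weight` parameter, and all function names are assumptions for illustration; the project's actual loss is not specified here.

```python
import math


def softmax(logits):
    """Convert a list of logits into probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def mean_kl(teacher_logits, student_logits):
    """Average KL(teacher || student) over a batch of logit vectors."""
    pairs = list(zip(teacher_logits, student_logits))
    return sum(kl_divergence(softmax(t), softmax(s)) for t, s in pairs) / len(pairs)


def unlearning_loss(teacher_retain, student_retain,
                    teacher_forget, student_forget, forget_weight=1.0):
    """Lower is better: agree with the teacher on retain, disagree on forget."""
    retain_term = mean_kl(teacher_retain, student_retain)
    forget_term = mean_kl(teacher_forget, student_forget)
    return retain_term - forget_weight * forget_term
```

Minimizing this value pulls the student toward the teacher on the retain set and pushes it away on the forget set. In practice the forget term is often bounded or clipped so that it cannot dominate the objective.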
Computational Geometry is a branch of computer science dedicated to the study of algorithms stated in geometric terms.
I wrote efficient ports of QuickHull and PolyLabel in NumPy.
The QuickHull Algorithm extends to arbitrary dimensions and has 2D/3D visualization support.
The PolyLabel Algorithm extends to any complex polygon.
Contributed as ConvexHull and LabeledPolygram for ManimCE, an open source graphics software with over 20K GitHub Stars.
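For readers unfamiliar with QuickHull, here is a minimal 2D sketch in plain Python. It is not the NumPy port described above and not the ManimCE API; the function names are hypothetical and the code handles only the planar case.

```python
def cross(o, a, b):
    """Twice the signed area of triangle o, a, b; positive if b is left of o->a."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def _hull_side(points, a, b):
    """Hull vertices strictly left of the directed line a->b, ordered from a to b."""
    left = [p for p in points if cross(a, b, p) > 0]
    if not left:
        return []
    # The point farthest from the line a->b is guaranteed to be on the hull.
    far = max(left, key=lambda p: cross(a, b, p))
    return _hull_side(left, a, far) + [far] + _hull_side(left, far, b)


def quickhull(points):
    """Convex hull of 2D points (tuples), in clockwise order from the leftmost point."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return pts
    a, b = pts[0], pts[-1]  # leftmost and rightmost points are always on the hull
    return [a] + _hull_side(pts, a, b) + [b] + _hull_side(pts, b, a)
```

For example, `quickhull([(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)])` returns the four corners of the square and drops the interior point `(1, 1)`. Points that lie exactly on a hull edge are also dropped, because the side test is strict.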
Understanding the relationships between courses is of interest to a university's educational mission.
We employ a range of supervised and unsupervised methods to understand the Stanford educational structure.
Contributed as the Prerequisite Tree feature for OnCourse, a course-planning startup with over 5,000 users.
The challenge involves creating a probability-driven project of one's choosing that highlights concepts from the class and does something interesting.
My project discusses the Hilbert Space of Random Variables.
PVSA Gold
This award honors individuals whose service positively impacts communities and inspires those around them to take action, too.
Skills
Python
C
C++
R
SQL
HTML
CSS
JS
NumPy
Pandas
SciPy
Matplotlib
PyTorch
OpenCV
D3
Quarto
LaTeX
Communication
At heart, I am an explainer and entertainer.
I have posted hundreds of solutions on r/LearnMath.
I have blogged about my projects and interests over the years.
I run Lyte Lectures, a 1.5K-strong YouTube channel that seeks to communicate ideas with an accessible approach and artful style.