CURIS Fellow
I worked with Chen Shani and Dan Jurafsky to trace concepts in language models. One strategy for reading a concept \( C \) out of a language model is to apply a linear probe with concept weight \( \mathbf{w}_{c} \) and bias \( b \) to the hidden state \( \mathbf{h}_s \) of a prefix \( s \): \[ \mathbb{P}(s \in C) = \sigma(\mathbf{w}_{c}^{\top} \mathbf{h}_s + b) \] This defines a curve over context: each prefix is associated with a mixture of concepts. We write the average as an incremental update and view it as a discretized differential equation. This lets us study the behavior of concept space, e.g., fixed points and bifurcations.
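As a rough illustration of the probe and the incremental update, here is a minimal sketch in plain Python. It is not the project's code: the hidden states and probe parameters are random placeholders, the names `probe_probability` and `running_average` are hypothetical, and it assumes that "the average" means a running mean of probe probabilities over successive prefixes. Under that assumption the update \( m_t = m_{t-1} + (p_t - m_{t-1})/t \) can be read as an Euler step of \( \dot{m} = p - m \) with step size \( 1/t \).

```python
import math
import random


def sigmoid(z):
    """Logistic function sigma(z)."""
    return 1.0 / (1.0 + math.exp(-z))


def probe_probability(h, w, b):
    """Linear probe: P(s in C) = sigma(w^T h + b) for one hidden state h."""
    return sigmoid(sum(wi * hi for wi, hi in zip(w, h)) + b)


def running_average(values):
    """Running mean written as an incremental update.

    m_t = m_{t-1} + (x_t - m_{t-1}) / t, i.e. an Euler step of
    dm/dtau = x - m with step size 1/t.
    """
    m = 0.0
    trace = []
    for t, x in enumerate(values, start=1):
        m += (x - m) / t
        trace.append(m)
    return trace


# Placeholder data (assumption): 6 prefixes with 4-dimensional hidden states.
random.seed(0)
dim, n_prefixes = 4, 6
w = [random.gauss(0, 1) for _ in range(dim)]
b = 0.1
hidden_states = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_prefixes)]

# One probe probability per prefix, then its running mean over context.
probs = [probe_probability(h, w, b) for h in hidden_states]
avg = running_average(probs)
```

Each entry of `avg` is the mean probe probability over the prefixes seen so far, so the list traces one concept's curve over context.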
Data Director
The Stanford Daily is the independent, student-run newspaper of Stanford University. The Daily strives to serve the Stanford community with relevant, unbiased journalism and provides its editorial, tech and business staffs with unparalleled educational opportunities. As the Data Director, I oversee data section recruitment, training, project management, and associated institutional knowledge.
Technologies:
- GitHub
- Flourish
- D3
Data Analyst
The Department of Veterans Affairs' Program Evaluation and Resource Center (PERC) provides program evaluation and technical assistance for mental health quality improvement efforts across the Veterans Health Administration. Our objective was to develop an approach for categorizing narrative content about mental health concepts into operationally relevant categories. Given the burgeoning threat of xylazine-laced fentanyl, we used xylazine as a case example.
- Extracted snippets from Text Integration Utilities (TIU) notes
- Developed a custom tokenization procedure with built-in text repair processes
- Fine-tuned BioClinicalBERT weights
- Tested and validated model predictions
Technologies:
- Azure Data Studio
- SQL
- PyTorch
- HuggingFace