Alexandra Ralevski's Portfolio

Project 1: Using Large Language Models to Identify Housing Insecurity

Led a Generative AI team in collaboration with Providence Health to use Large Language Models to extract complex unstructured data from 25,217 notes from 795 pregnant patients.
Used Chain-of-Thought prompting and few-shot learning to extract SDoH (Social Determinants of Health) data, including housing insecurity, achieving a recall of 0.92 (higher than human annotators) and a precision of 0.85.
More efficient methods for obtaining structured SDoH data can help accelerate inclusion of exposome variables in biomedical research, and support healthcare systems in identifying patients who could benefit from proactive outreach.
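As a rough illustration of how few-shot examples and Chain-of-Thought reasoning can be combined in one prompt, here is a minimal sketch. The instruction wording, the `Example` class, the helper names, and the synthetic notes are all assumptions made for illustration; they are not the prompts or code used in the project, and no model API is called.

```python
from dataclasses import dataclass


@dataclass
class Example:
    note: str       # synthetic note text (not real patient data)
    reasoning: str  # worked reasoning shown to the model
    label: str      # final answer, "yes" or "no"


# Hypothetical instruction text, written for this sketch only.
INSTRUCTION = (
    "Decide whether the note describes current or past housing instability. "
    "Reason step by step, then finish with 'Answer: yes' or 'Answer: no'."
)

# Hypothetical few-shot examples.
EXAMPLES = [
    Example(
        note="Patient reports staying at a shelter since last month.",
        reasoning="Staying at a shelter indicates the patient lacks stable housing.",
        label="yes",
    ),
    Example(
        note="Patient lives with spouse in their own home.",
        reasoning="The patient has a stable residence and no instability is mentioned.",
        label="no",
    ),
]


def build_prompt(examples, note):
    """Assemble instruction, worked examples, and the new note into one prompt."""
    parts = [INSTRUCTION, ""]
    for ex in examples:
        parts += [
            f"Note: {ex.note}",
            f"Reasoning: {ex.reasoning}",
            f"Answer: {ex.label}",
            "",
        ]
    # End on an open "Reasoning:" so the model writes its reasoning first.
    parts += [f"Note: {note}", "Reasoning:"]
    return "\n".join(parts)


def parse_answer(completion):
    """Return True/False from the last 'Answer:' line, or None if it is missing or unclear."""
    for line in reversed(completion.strip().splitlines()):
        if line.lower().startswith("answer:"):
            value = line.split(":", 1)[1].strip().lower().rstrip(".")
            return {"yes": True, "no": False}.get(value)
    return None
```

The prompt returned by `build_prompt` would be sent to whichever model is being evaluated, and `parse_answer` turns the free-text completion back into a structured label.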

Comparison of recall and precision for Regex, GPT-3.5, GPT-4, and manual annotation in identifying notes with current or past housing instability, measured on 539 manually annotated notes.
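For reference, recall and precision against manually annotated notes can be computed as in the sketch below. The function name and the toy labels are assumptions for illustration and are not project data.

```python
def precision_recall(gold, predicted):
    """Compute (precision, recall) for binary labels, where 1 means housing instability."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g and p)      # flagged and truly positive
    fp = sum(1 for g, p in pairs if not g and p)  # flagged but actually negative
    fn = sum(1 for g, p in pairs if g and not p)  # missed positives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy labels for eight notes (illustrative only).
gold_labels = [1, 1, 1, 0, 0, 0, 1, 0]
model_labels = [1, 1, 0, 1, 0, 0, 1, 0]
precision, recall = precision_recall(gold_labels, model_labels)
```

Recall is the fraction of truly positive notes that a method finds, and precision is the fraction of flagged notes that are truly positive.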

Project 2: AI-Based Risk Prediction of CKD Patients

Developed an XGBoost model with an AUROC of 0.83 to identify patients at high risk of CKD (Chronic Kidney Disease).
Developed CROP (Clinical Recruitment Optimization Pipeline), a statistical method that improves ML predictions to identify high-risk patients for clinical trials.
Using CROP together with the CKD risk model resulted in a six-fold decrease in the total number of patients needed for clinical trial recruitment.
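AUROC can be read as the probability that a randomly chosen patient who develops the outcome receives a higher risk score than a randomly chosen patient who does not. The sketch below computes it directly from that definition on toy data; the function name and scores are assumptions for illustration, and the pairwise loop is only practical for small samples.

```python
def auroc(labels, scores):
    """Pairwise AUROC: fraction of (positive, negative) pairs ranked correctly, ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy outcome labels and model risk scores (illustrative only).
toy_labels = [1, 0, 1, 0, 1]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.2]
toy_auroc = auroc(toy_labels, toy_scores)
```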

CROP procedure of adjusting model probabilities and estimating the number of transitions under the Poisson model.
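To give a feel for how a Poisson model can turn adjusted probabilities into a recruitment estimate, here is a minimal sketch. It is not the CROP implementation. It assumes that each patient's adjusted probability is an independent chance of transition, that the total number of transitions is approximately Poisson with mean equal to the sum of those probabilities, and that patients are enrolled in order of decreasing risk. The function names, the 80% confidence level, and the toy cohort are all hypothetical.

```python
import math


def prob_at_least(k, lam):
    """P(N >= k) for N ~ Poisson(lam)."""
    cdf_below_k = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return 1.0 - cdf_below_k


def patients_needed(probs, target_events, confidence=0.8):
    """Smallest number of top-ranked patients expected to yield the target number
    of transitions with the given confidence, or None if the cohort is too small."""
    lam = 0.0
    for n, p in enumerate(sorted(probs, reverse=True), start=1):
        lam += p  # expected transitions among the first n patients
        if prob_at_least(target_events, lam) >= confidence:
            return n
    return None


# Toy cohort: 20 high-risk and 200 low-risk patients (illustrative only).
cohort = [0.6] * 20 + [0.05] * 200
n_ranked = patients_needed(cohort, target_events=5)
```

Enrolling by predicted risk raises the expected number of transitions per enrolled patient, which is why fewer patients are needed to observe a given number of events.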

Project 3: Generating Training Data Using Weak Supervision with Snorkel (NASA)

Utilized the Snorkel system to build a training set of labeled biomimicry papers.
Labeling our data by hand was prohibitively slow, so we turned to a weak supervision approach using labeling functions (LFs) in Snorkel.
LFs are noisy, programmatic rules and heuristics that assign labels to unlabeled training data.
We trained a classifier that predicts the label a given biomimicry paper should receive with 95% accuracy.
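The idea of labeling functions can be sketched in plain Python as below. This does not use the Snorkel library: the label names, the keyword rules, and the sample texts are hypothetical, and a simple majority vote stands in for the generative label model that Snorkel learns to weigh and combine LF outputs.

```python
from collections import Counter

# Hypothetical label scheme; ABSTAIN means the rule has no opinion.
ABSTAIN, NOT_BIOMIMICRY, BIOMIMICRY = -1, 0, 1


def lf_biomimetic_keyword(text):
    lowered = text.lower()
    return BIOMIMICRY if "biomimetic" in lowered or "bio-inspired" in lowered else ABSTAIN


def lf_inspired_by(text):
    return BIOMIMICRY if "inspired by" in text.lower() else ABSTAIN


def lf_clinical_trial(text):
    return NOT_BIOMIMICRY if "clinical trial" in text.lower() else ABSTAIN


LFS = [lf_biomimetic_keyword, lf_inspired_by, lf_clinical_trial]


def majority_vote(votes):
    """Combine LF votes; abstain when no LF fires or the top labels tie."""
    counts = Counter(v for v in votes if v != ABSTAIN).most_common()
    if not counts:
        return ABSTAIN
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return ABSTAIN
    return counts[0][0]


def weak_labels(texts):
    """Apply every LF to every text and reduce each row of votes to one label."""
    return [majority_vote([lf(t) for lf in LFS]) for t in texts]


# Toy paper titles (illustrative only).
papers = [
    "A biomimetic adhesive inspired by gecko feet",
    "Results of a randomized clinical trial of statins",
    "A survey of graph algorithms",
]
labels = weak_labels(papers)
```

Papers on which every LF abstains stay unlabeled and would be left out of the training set for the downstream classifier.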

An overview of the Snorkel system. (1) Subject matter expert (SME) users write labeling functions (LFs) that express weak supervision sources such as distant supervision, patterns, and heuristics. (2) Snorkel applies the LFs over unlabeled data and learns a generative model to combine the LFs' outputs into probabilistic labels. (3) Snorkel uses these labels to train a discriminative classification model, such as a deep neural network. Adapted from Ratner et al. (2017).

Contact

For questions, contact Alexandra Ralevski ([email protected]).
