My Experience at Planck AI

This internship gave me the opportunity to work through the full pipeline of building machine learning models and deploying them in an application accessible to users. Over the course of the internship I:

  • Managed end-to-end AI model development, from data extraction and cleaning through LoRA fine-tuning and RAG pipeline integration, ensuring consistent progress.
  • Developed and fine-tuned transformer-based models (Phi-2, T5) using LoRA and RAG pipelines to enhance legal document comprehension and question-answering performance.
  • Engineered an ETL data pipeline to parse, clean, and format legal contract data (CUAD dataset) into JSONL for model training and evaluation.
  • Integrated TF-IDF and Logistic Regression models into a Streamlit web app, enabling interactive document classification, clause tagging, and analytics visualization.
  • Designed and optimized a LangChain RAG pipeline, improving document retrieval efficiency and model response accuracy through customized prompt templates and vector stores.
  • Researched transformer architectures and model evaluation techniques, applying the findings to fine-tuning strategies and hyperparameter optimization for improved model performance.
  • Collaborated in weekly cross-functional meetings, proposing benchmarking strategies and leading discussions on data labeling formats, LoRA integration, and evaluation benchmarks.
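To make the data-preparation step concrete, here is a minimal sketch of the kind of ETL transformation the pipeline performed: turning raw clause records into JSONL prompt/completion pairs for fine-tuning. The field names and sample records are illustrative assumptions, not the actual CUAD schema or internship code.

```python
import json

# Hypothetical clause records; the real CUAD dataset is far larger
# and has a different schema.
raw_clauses = [
    {"contract": "acme_msa.txt", "clause_type": "Governing Law",
     "text": "This Agreement shall be governed by the laws of Delaware."},
    {"contract": "acme_msa.txt", "clause_type": "Termination",
     "text": "Either party may terminate upon 30 days written notice. "},
]

def to_jsonl_records(clauses):
    """Map raw clause dicts to prompt/completion pairs for fine-tuning."""
    for clause in clauses:
        yield {
            "prompt": f"Identify the clause type: {clause['text'].strip()}",
            "completion": clause["clause_type"],
        }

# Write one JSON object per line, the JSONL format expected by most
# fine-tuning tooling.
with open("train.jsonl", "w", encoding="utf-8") as f:
    for record in to_jsonl_records(raw_clauses):
        f.write(json.dumps(record) + "\n")
```

The same pattern extends to train/eval splits by routing records to separate files.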
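For the LoRA fine-tuning work, the adapter setup with the Hugging Face `peft` library looks roughly like the fragment below. The rank, alpha, and target module names are illustrative assumptions, not the configuration actually used for Phi-2 or T5 during the internship.

```python
from peft import LoraConfig, TaskType

# Illustrative adapter config; r, lora_alpha, and target_modules are
# assumptions, not the internship's actual hyperparameters.
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                  # rank of the low-rank update matrices
    lora_alpha=16,        # scaling factor applied to the update
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # assumed attention projections
)
```

The config is then attached to a loaded base model with `get_peft_model(model, lora_config)`, after which only the small adapter matrices are trainable.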
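The classical baseline behind the Streamlit app can be sketched as a scikit-learn pipeline chaining TF-IDF features into a logistic regression classifier. The toy clauses and labels below are invented stand-ins for the real training data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Toy labeled clauses standing in for the real contract data.
texts = [
    "governed by the laws of the state of delaware",
    "governed by and construed under new york law",
    "may terminate this agreement upon written notice",
    "either party may terminate for material breach",
]
labels = ["governing_law", "governing_law", "termination", "termination"]

# TF-IDF features feeding a logistic regression classifier.
clf = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),
    ("logreg", LogisticRegression(max_iter=1000)),
])
clf.fit(texts, labels)

pred = clf.predict(["this agreement is governed by california law"])[0]
print(pred)
```

In the app, a fitted pipeline like this sits behind the upload widget: user text goes in, a clause tag comes out, and the per-class probabilities from `predict_proba` drive the analytics charts.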
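The core idea of the RAG pipeline, retrieve the most relevant passages and splice them into a prompt template, can be shown without any framework. The sketch below uses bag-of-words vectors and cosine similarity as a stand-in for the embedding-based vector store the real LangChain pipeline used; the documents, template, and function names are all illustrative.

```python
import math
import re
from collections import Counter

# Tiny in-memory corpus standing in for the contract vector store.
docs = [
    "The agreement is governed by the laws of the State of Delaware.",
    "Either party may terminate this agreement upon thirty days notice.",
    "The licensee shall not assign this agreement without prior consent.",
]

def vectorize(text):
    """Bag-of-words vector; a real pipeline would use learned embeddings."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, k=1):
    """Return the k documents most similar to the query."""
    qv = vectorize(query)
    ranked = sorted(docs, key=lambda d: cosine(qv, vectorize(d)), reverse=True)
    return ranked[:k]

# Retrieved context is spliced into a customized prompt template.
PROMPT_TEMPLATE = "Answer using only this context:\n{context}\n\nQuestion: {question}"

question = "Which state's law governs the agreement?"
context = "\n".join(retrieve(question))
prompt = PROMPT_TEMPLATE.format(context=context, question=question)
```

Swapping the bag-of-words step for an embedding model and a persistent vector store gives the production-shaped version; the retrieve-then-prompt structure stays the same.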