This repo provides a comprehensive overview of ELMo (Embeddings from Language Models), a deep contextualized word representation model developed by researchers at the Allen Institute for Artificial Intelligence (AI2) in 2018. Unlike traditional word embedding models, which assign each word a single static vector, ELMo captures the contextual meaning of words within sentences, significantly improving performance across a range of natural language processing tasks. This report delves into the architecture of ELMo, highlighting its stacked bidirectional Long Short-Term Memory (Bi-LSTM) networks, its bidirectional language-modeling training objective, and its approach of representing each word as a learned combination of all internal layers. Furthermore, it discusses the pretraining process and the subsequent fine-tuning of ELMo for specific downstream tasks, such as sentiment analysis and natural language inference. Lastly, it emphasizes the role of stacked Bi-LSTMs in capturing the intricacies of language, with lower layers tending to encode more syntactic and higher layers more semantic information, and how this contributes to ELMo's effectiveness in various applications.
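
To make the "combination of all internal layers" idea concrete, below is a minimal sketch (in plain PyTorch, not the repo's actual code) of the task-specific scalar mix from the ELMo paper, ELMo_k = γ · Σ_j softmax(s)_j · h_{k,j}, where h_{k,0} is the token embedding and h_{k,1..L} are the Bi-LSTM layer outputs. The module name, tensor shapes, and random activations here are illustrative assumptions.

```python
# Sketch of ELMo's learned layer combination (scalar mix), assuming PyTorch.
# Layer activations are stand-ins; a real setup would come from a pretrained
# bidirectional language model.
import torch
import torch.nn as nn


class ScalarMix(nn.Module):
    """Task-specific weighted average over the L+1 ELMo layers."""

    def __init__(self, num_layers: int):
        super().__init__()
        self.scalars = nn.Parameter(torch.zeros(num_layers))  # s_j, softmax-normalized
        self.gamma = nn.Parameter(torch.ones(1))               # global scale gamma

    def forward(self, layers: torch.Tensor) -> torch.Tensor:
        # layers: (num_layers, batch, seq_len, dim)
        weights = torch.softmax(self.scalars, dim=0)
        mixed = (weights.view(-1, 1, 1, 1) * layers).sum(dim=0)
        return self.gamma * mixed


# Hypothetical usage: token layer + two Bi-LSTM layers, dim 1024 as in the paper.
layers = torch.randn(3, 8, 20, 1024)  # (L+1, batch, seq_len, dim)
mix = ScalarMix(num_layers=3)
elmo_embeddings = mix(layers)          # (8, 20, 1024)
print(elmo_embeddings.shape)
```

Because the mixing weights are learned per downstream task, a syntax-heavy task can weight lower layers more heavily while a semantics-heavy task can favor higher layers.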
Please refer to the Jupyter notebook and the website linked for project details.