Showing 2 changed files with 45 additions and 9 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,13 +1,49 @@ | ||
# NLP: Zero to Hero | ||
# Natural Language Processing: Zero to Hero! | ||
<br> | ||
Welcome to the theory and hands-on practice of NLP. | ||
|
||
In this repository, I've tried to cover nearly all the topics in NLP, from Tokenization to the Transformer Architecture. By the time you finish, you will have a solid grasp of the core concepts of NLP. | ||
|
||
This repository will help you learn NLP with the aim of making you understand why concepts are the way they are. | ||
|
||
In this repository, I've covered almost everything you need to get started in the world of NLP, from Tokenizers to the Transformer Architecture. By the time you finish, you will have a solid grasp of the core concepts of NLP. | ||
|
||
The goal of this repository is to give you the core intuition: by the end, you'll know how NLP techniques evolved over the years and why they are the way they are. | ||
|
||
|
||
![alt text](./assets/hero.png) | ||
|
||
*Image Generated by Ideogram* | ||
|
||
# Table of Contents | ||
### 1. [Tokenization](./Notebooks/01_Tokenization.ipynb) | ||
### 2. [Preprocessing](./Notebooks/02_Pre_Processing.ipynb) | ||
### 3. [Bag of Words and Similarity](./Notebooks/03_BOW_Similarity.ipynb) | ||
### 4. [TF-IDF and Document Search](./Notebooks/04_TFIDF_DocSearch.ipynb) | ||
### 5. [Naive Bayes Text Classification](./Notebooks/05_NaiveBayes_TextClf.ipynb) | ||
### 6. [LDA Topic Modelling](./Notebooks/06_LDA_TopicModelling.ipynb) | ||
### 7. [Word Embeddings](./Notebooks/07_Word_Embeddings.ipynb) | ||
### 8. [Recurrent Neural Networks (RNNs) and Language Modelling](./Notebooks/08_RNNs_LMs.ipynb) | ||
### 9. [Machine Translation and Attention](./Notebooks/09_Machine_Translation_Attention.ipynb) | ||
### 10. [Transformers](./Notebooks/10_Transformers.ipynb) | ||
|
||
# How do I use this repository? | ||
* Because machine learning and deep learning workloads are computationally demanding, Google Colab or Kaggle Kernels are recommended. | ||
* You can click on <a target="_blank" href="https://colab.research.google.com/github/JUSTSUJAY/NLP_One_Shot/blob/main/Notebooks/01_Tokenization.ipynb"> | ||
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/> | ||
</a> to open the notebook in Colab. | ||
* You can click on [![Kaggle](https://kaggle.com/static/images/open-in-kaggle.svg)](https://kaggle.com/kernels/welcome?src=https://github.com/JUSTSUJAY/NLP_One_Shot/blob/main/Notebooks/01_Tokenization.ipynb) to open the notebook in Kaggle. | ||
* Some notebooks use Kaggle datasets, some of which are gigabytes in size. | ||
* To load those datasets faster, open those notebooks in Kaggle using the corresponding badge. | ||
* Opening the Kaggle Kernel does not automatically attach the dataset required for the notebook. | ||
* You must attach the dataset yourself. Each notebook links to the dataset it needs, and you will find the link as you work through it. | ||
* Start with the `Tokenization` Notebook and move forward sequentially. | ||
* Take your time to understand the concepts and code. The material is designed to be easy to understand and to be worked through at your own pace. | ||
* Make sure you have a basic understanding of Python programming before starting. | ||
* If you encounter any issues or have questions, feel free to open an issue in the GitHub repository. | ||
* Don't forget to star the repository if you find it helpful! | ||
|
||
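The Colab and Kaggle badges above point at the first notebook only. As a minimal sketch, assuming every notebook in the table of contents follows the same URL pattern as `01_Tokenization.ipynb` (the helper names `colab_url` and `kaggle_url` are hypothetical and not part of the repository), the links for the remaining notebooks could be generated like this:

```python
# Sketch only: assumes all notebooks live under Notebooks/ on the main branch
# and that the badge URL pattern shown for notebook 01 applies to the others.
REPO = "JUSTSUJAY/NLP_One_Shot"
BRANCH = "main"

# File names taken from the table of contents.
NOTEBOOKS = [
    "01_Tokenization.ipynb",
    "02_Pre_Processing.ipynb",
    "03_BOW_Similarity.ipynb",
    "04_TFIDF_DocSearch.ipynb",
    "05_NaiveBayes_TextClf.ipynb",
    "06_LDA_TopicModelling.ipynb",
    "07_Word_Embeddings.ipynb",
    "08_RNNs_LMs.ipynb",
    "09_Machine_Translation_Attention.ipynb",
    "10_Transformers.ipynb",
]


def colab_url(notebook: str) -> str:
    """Build the 'Open in Colab' link for a notebook file name (hypothetical helper)."""
    return (
        "https://colab.research.google.com/github/"
        f"{REPO}/blob/{BRANCH}/Notebooks/{notebook}"
    )


def kaggle_url(notebook: str) -> str:
    """Build the 'Open in Kaggle' link for a notebook file name (hypothetical helper)."""
    return (
        "https://kaggle.com/kernels/welcome?src="
        f"https://github.com/{REPO}/blob/{BRANCH}/Notebooks/{notebook}"
    )


if __name__ == "__main__":
    # Print both links for each notebook, in the recommended reading order.
    for name in NOTEBOOKS:
        print(name)
        print("  Colab: ", colab_url(name))
        print("  Kaggle:", kaggle_url(name))
```

Opening a notebook through the generated Kaggle link still requires attaching its dataset by hand, as described in the list above.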
# Contributing | ||
You are more than welcome to contribute to this repository. You can start by opening an issue or submitting a pull request. If you have any questions, feel free to reach out to me on [X](https://x.com/sujay_kapadnis). | ||
|
||
If you have any resources that you think would be helpful for others, feel free to open an issue or submit a pull request. | ||
|
||
# License | ||
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details. | ||
|