
Natural-Language-Processing

-> BiGrams

Compute the bigram model (counts and probabilities) on a given corpus for the following three scenarios:

  1. No Smoothing

  2. Add-one Smoothing

  3. Good-Turing Discounting based Smoothing

Run Command: `python BiGrams.py <path of the file CorpusTreebank2.txt>`
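
A minimal sketch of the three computations, assuming a pre-tokenized list of words; the function and variable names are illustrative and are not taken from BiGrams.py:

```python
from collections import Counter

def bigram_models(tokens):
    """Bigram counts plus the three probability variants."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    vocab = len(unigrams)               # V, for add-one smoothing
    total = sum(bigrams.values())       # N, total bigram tokens
    nc = Counter(bigrams.values())      # counts-of-counts N_c

    def p_unsmoothed(w1, w2):
        # MLE: C(w1 w2) / C(w1); zero for unseen bigrams
        return bigrams[(w1, w2)] / unigrams[w1]

    def p_add_one(w1, w2):
        # Laplace: (C(w1 w2) + 1) / (C(w1) + V)
        return (bigrams[(w1, w2)] + 1) / (unigrams[w1] + vocab)

    def p_good_turing(w1, w2):
        # Discounted count c* = (c + 1) * N_{c+1} / N_c; unseen
        # bigrams share the total mass N_1 / N. (A full version
        # smooths N_c for high c, where N_{c+1} may be zero.)
        c = bigrams[(w1, w2)]
        if c == 0:
            return nc[1] / total
        return (c + 1) * nc[c + 1] / nc[c] / total

    return p_unsmoothed, p_add_one, p_good_turing
```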

-> Brills

Implement Brill’s transformation-based POS tagging algorithm, using ONLY the previous word’s tag as context, to extract the best transformation rule to:

i. Transform “NN” to “JJ”

ii. Transform “NN” to “VB”

Using the learnt rules, fill in the missing POS tags (for the words “standard” and “work”) in the following sentence:

`The_DT standard_?? Turbo_NN engine_NN is_VBZ hard_JJ to_TO work_??`

Run Command: `python Brills.py <path of the file POSTaggedTrainingSet.txt>`
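
A sketch of the rule-extraction step under common TBL assumptions: the initial tagger assigns each word its most frequent tag from the training data, and a candidate rule "change from_tag to to_tag when the previous tag is Z" is scored by corrections minus new errors. Names are illustrative, not taken from Brills.py:

```python
from collections import Counter, defaultdict

def best_rule(gold, from_tag, to_tag):
    """gold: list of (word, tag) pairs from the training file."""
    # Baseline tagger: most frequent tag per word in the training data.
    freq = defaultdict(Counter)
    for word, tag in gold:
        freq[word][tag] += 1
    initial = [freq[word].most_common(1)[0][0] for word, _ in gold]

    # Score each trigger Z for the rule
    # "change from_tag -> to_tag when the previous word's tag is Z".
    scores = Counter()
    for i in range(1, len(gold)):
        if initial[i] != from_tag:
            continue
        z = initial[i - 1]
        if gold[i][1] == to_tag:
            scores[z] += 1      # the rule would fix this token
        elif gold[i][1] == from_tag:
            scores[z] -= 1      # the rule would break a correct tag
    return scores.most_common(1)[0]   # (best trigger tag, net score)
```

Calling `best_rule(corpus, "NN", "JJ")` and `best_rule(corpus, "NN", "VB")` would yield the two rules, which can then be applied to the `??` positions in the test sentence.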

-> POS Probabilities

Compute the bigram models (counts and probabilities) required by Naïve Bayesian classification (bigram)-based POS tagging.

Run Command: `python POS_probs.py <path of the file POSTaggedTrainingSet.txt>`
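
The two bigram-level models this typically entails are tag-transition probabilities P(t_i | t_{i-1}) and word-emission probabilities P(w_i | t_i). A sketch with illustrative names, assuming the training file parses into (word, tag) pairs:

```python
from collections import Counter

def pos_bigram_models(tagged):
    """tagged: list of (word, tag) pairs, e.g. parsed from word_TAG tokens."""
    tags = [tag for _, tag in tagged]
    tag_counts = Counter(tags)
    transitions = Counter(zip(tags, tags[1:]))   # C(t_{i-1}, t_i)
    emissions = Counter(tagged)                  # C(word, tag)

    def p_transition(prev, cur):                 # P(t_i | t_{i-1})
        return transitions[(prev, cur)] / tag_counts[prev]

    def p_emission(word, tag):                   # P(word | tag)
        return emissions[(word, tag)] / tag_counts[tag]

    return p_transition, p_emission
```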

-> ViterbiHMM

Implements the Viterbi algorithm to compute the most likely weather (hidden state) sequence and its probability for any given observation sequence. Example observation sequences: 331, 122313, 331123312, etc.

The given HMM (states, transition, and emission probabilities) is specified in the accompanying figure.

Run Command: `python ViterbiHMM.py`
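
A generic sketch of the decoder; the states and probabilities below are placeholders only, since the actual parameters come from the HMM figure:

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most likely state sequence and its probability for obs."""
    # V[t][s] = (best probability of any path ending in state s at
    #            time t, the predecessor state on that path)
    V = [{s: (start_p[s] * emit_p[s][obs[0]], None) for s in states}]
    for o in obs[1:]:
        V.append({
            s: max((V[-1][p][0] * trans_p[p][s] * emit_p[s][o], p)
                   for p in states)
            for s in states
        })
    # Pick the best final state, then follow the back-pointers.
    prob, last = max((V[-1][s][0], s) for s in states)
    path = [last]
    for t in range(len(V) - 1, 0, -1):
        path.append(V[t][path[-1]][1])
    return path[::-1], prob

# Placeholder parameters (NOT the figure's actual values): two hidden
# weather states, observations drawn from {1, 2, 3}.
states = ("Hot", "Cold")
start = {"Hot": 0.8, "Cold": 0.2}
trans = {"Hot": {"Hot": 0.7, "Cold": 0.3},
         "Cold": {"Hot": 0.4, "Cold": 0.6}}
emit = {"Hot": {"1": 0.2, "2": 0.4, "3": 0.4},
        "Cold": {"1": 0.5, "2": 0.4, "3": 0.1}}

print(viterbi("331", states, start, trans, emit))
```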