POS_Tagging_HMM

An implementation of a Hidden Markov Model (HMM) for part-of-speech tagging.

Problem Statement

Write a Hidden Markov Model part-of-speech tagger for Catalan. The training data is provided tokenized and tagged (in the hw5-data-corpus directory); the test data is provided tokenized only, and your tagger will add the tags.

Data Format

The corpus (hw5-data-corpus) contains:

A file with tagged training data in the word/TAG format, with words separated by spaces and each sentence on a new line.
A file with untagged development data, with words separated by spaces and each sentence on a new line.
A file with tagged development data in the word/TAG format, with words separated by spaces and each sentence on a new line, to serve as an answer key.
A readme/license file (which you won't need for the exercise).
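
The word/TAG format described above can be read and written with a few lines of Python. This is a minimal sketch, not the repository's code: the function names are hypothetical, the sample tags in the usage note are made up for illustration, and it assumes that the tag is whatever follows the last slash in a token, so that a word containing a slash of its own is not split in the wrong place.

```python
def parse_tagged_line(line):
    """Split one 'word/TAG word/TAG ...' line into (word, tag) pairs."""
    pairs = []
    for token in line.split():
        # rpartition splits on the LAST slash, so a word that itself
        # contains '/' (an assumed possibility) keeps its slashes.
        word, _, tag = token.rpartition("/")
        pairs.append((word, tag))
    return pairs


def format_tagged_line(pairs):
    """Inverse of parse_tagged_line: join (word, tag) pairs into one line."""
    return " ".join(f"{word}/{tag}" for word, tag in pairs)
```

For example, `parse_tagged_line("El/DA gat/NC")` yields `[("El", "DA"), ("gat", "NC")]`, and passing that list to `format_tagged_line` reproduces the original line.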

Programs

You will write two programs: hmmlearn.py will learn a hidden Markov model from the training data, and hmmdecode.py will use the model to tag new data. The learning program will be invoked in the following way:

python hmmlearn.py /path/to/input

The argument is a single file containing the training data; the program will learn a hidden Markov model, and write the model parameters to a file called hmmmodel.txt. The format of the model is up to you, but it should contain sufficient information for hmmdecode.py to successfully tag new data. The tagging program will be invoked in the following way:

python hmmdecode.py /path/to/input

The argument is a single file containing the test data; the program will read the parameters of a hidden Markov model from the file hmmmodel.txt, tag each word in the test data, and write the results to a text file called hmmoutput.txt in the same format as the training data.
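
The two steps described above, learning a model and decoding with it, can be sketched as follows. This is an illustration under stated assumptions rather than the code in hmmlearn.py or hmmdecode.py: the function names, the `<s>` and `</s>` boundary pseudo-tags, the JSON layout of hmmmodel.txt, add-one smoothing of transitions, and the treatment of unseen words (emission ignored, transitions decide) are all choices made here, since the task leaves the model format and smoothing open. Sentences are assumed to be already parsed into lists of (word, tag) pairs.

```python
import json
import math
from collections import Counter, defaultdict

START, END = "<s>", "</s>"  # hypothetical boundary pseudo-tags


def train(sentences):
    """Count tag transitions and word emissions from (word, tag) sentences."""
    trans = defaultdict(Counter)  # trans[prev_tag][tag] = count
    emit = defaultdict(Counter)   # emit[tag][word] = count
    for sent in sentences:
        prev = START
        for word, tag in sent:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            prev = tag
        trans[prev][END] += 1
    return {"trans": trans, "emit": emit}


def save_model(model, path="hmmmodel.txt"):
    """Write raw counts as JSON (an assumed format; any format would do)."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(model, f, ensure_ascii=False)


def load_model(path="hmmmodel.txt"):
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def viterbi(words, model):
    """Return the most probable tag sequence for a list of words."""
    if not words:
        return []
    trans, emit = model["trans"], model["emit"]
    tags = sorted(emit)
    n_next = len(tags) + 1  # possible successors: every tag plus END
    known = set().union(*emit.values())  # every word seen in training

    def log_trans(prev, cur):
        # Add-one smoothing, so no transition has zero probability.
        row = trans.get(prev, {})
        return math.log((row.get(cur, 0) + 1) / (sum(row.values()) + n_next))

    def log_emit(tag, word):
        if word not in known:
            return 0.0  # unseen word: skip emission, let transitions decide
        count = emit[tag].get(word, 0)
        return math.log(count / sum(emit[tag].values())) if count else -math.inf

    best = {START: 0.0}  # best[tag] = log-prob of the best path ending in tag
    backptrs = []        # backptrs[i][tag] = best previous tag at position i
    for word in words:
        scores, ptr = {}, {}
        for tag in tags:
            prev = max(best, key=lambda p: best[p] + log_trans(p, tag))
            scores[tag] = best[prev] + log_trans(prev, tag) + log_emit(tag, word)
            ptr[tag] = prev
        best = scores
        backptrs.append(ptr)

    # Pick the best final tag (including the transition to END), then walk back.
    path = [max(best, key=lambda p: best[p] + log_trans(p, END))]
    for ptr in reversed(backptrs[1:]):
        path.append(ptr[path[-1]])
    return path[::-1]
```

Working in log space avoids numeric underflow on long sentences. The sketch recomputes row totals on every call for brevity; a real tagger would precompute the log-probabilities once after loading the model.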

prakarshupmanyu/POS_Tagging_HMM
