For this project, we trained the model on a large amount of text: about 66 GB of data from the English Wikipedia corpus (https://dumps.wikimedia.org/enwiki/), which covers almost all words together with the words that typically appear near them. This data is fed to Latent Semantic Indexing (LSI), producing an LSI model of about 2.5 GB that stores the corpus in numerical form. This step is necessary because machine learning algorithms understand only numbers, not text, so all the text data is converted into a numerical form that the algorithms can work with.
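A minimal sketch of this training step using gensim is shown below; the dump file name, the dictionary filtering thresholds and the number of topics are illustrative assumptions, not the project's exact settings.

```python
from gensim.corpora import MmCorpus, WikiCorpus
from gensim.models import LsiModel

# Parse the Wikipedia dump (tens of GB) into a streamed bag-of-words corpus.
# The file name and filter thresholds are assumptions for illustration.
wiki = WikiCorpus('enwiki-latest-pages-articles.xml.bz2')
wiki.dictionary.filter_extremes(no_below=20, no_above=0.1)  # drop very rare / very common tokens

# Serialise the numeric corpus so it can be streamed from disk during training.
MmCorpus.serialize('wiki_bow.mm', wiki)
bow_corpus = MmCorpus('wiki_bow.mm')

# Train the LSI model; the saved model holds the corpus knowledge in numerical form.
lsi = LsiModel(bow_corpus, id2word=wiki.dictionary, num_topics=300)
lsi.save('wiki_lsi.model')
```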
After the LSI model is created, the marking scheme is taken, stop words are removed from it, and it is converted into a dictionary. This dictionary contains all the unique words, each mapped to a unique number; these numbers act as keys and the occurrences of the corresponding words as values. The dictionary is then converted into a bag of words, and the same procedure is applied to the answer sheet. The bag of words created from the marking scheme is passed to the LSI model, which prepares the data for comparison and yields a set of indexed values that makes the comparison process easier.
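The sketch below shows one way this preparation can look in gensim. The file name and the sentence splitting are assumptions, and the bag-of-words vectors are built with the word-to-id mapping stored inside the pre-trained LSI model so that the ids stay consistent with the model.

```python
from gensim.models import LsiModel
from gensim.parsing.preprocessing import STOPWORDS
from gensim.similarities import MatrixSimilarity
from gensim.utils import simple_preprocess

# Load the LSI model trained on the Wikipedia corpus.
lsi = LsiModel.load('wiki_lsi.model')
dictionary = lsi.id2word  # word <-> id mapping the model was trained with (assumed to be reused here)

# Split the marking scheme into sentences and remove stop words.
with open('marking_scheme.txt') as f:
    scheme_sentences = [s.strip() for s in f.read().split('.') if s.strip()]
scheme_tokens = [[w for w in simple_preprocess(s) if w not in STOPWORDS]
                 for s in scheme_sentences]

# Convert every sentence into a bag of words: (word id, occurrence count) pairs.
scheme_bow = [dictionary.doc2bow(tokens) for tokens in scheme_tokens]

# Project the bag-of-words vectors into LSI space and index the result;
# this index holds the set of values the answers are compared against.
scheme_index = MatrixSimilarity(lsi[scheme_bow], num_features=lsi.num_topics)
```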
Next, the answer sheet's corpus is passed through the LSI model and a set of values is generated. These values are compared with the indexed values of the marking scheme, and the LSI model assigns a comparison percentage. This is done for every sentence, so each sentence receives a comparison score. The average of these scores is taken as the comparison score of the answer. Marks are assigned to each answer based on its comparison score, and once all answers are processed, the marks are added up to obtain the total.
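A sketch of this scoring step is given below, reusing the objects from the previous sketch; the helper name and the rule that multiplies the average score by the question's maximum marks are assumptions made for illustration.

```python
from gensim.parsing.preprocessing import STOPWORDS
from gensim.utils import simple_preprocess

def score_answer(answer_text, lsi, dictionary, scheme_index, max_marks):
    """Return the marks awarded for one answer (hypothetical helper)."""
    # Split the answer into sentences and clean them the same way as the marking scheme.
    sentences = [s.strip() for s in answer_text.split('.') if s.strip()]
    sentence_scores = []
    for sentence in sentences:
        tokens = [w for w in simple_preprocess(sentence) if w not in STOPWORDS]
        sims = scheme_index[lsi[dictionary.doc2bow(tokens)]]  # similarity to every scheme sentence
        sentence_scores.append(float(sims.max()))             # best match for this sentence
    # The average of the per-sentence scores is the comparison score of the answer.
    comparison_score = sum(sentence_scores) / len(sentence_scores) if sentence_scores else 0.0
    return comparison_score * max_marks

# Total marks = sum of the marks obtained for every answer in the script, e.g.:
# total_marks = sum(score_answer(a, lsi, dictionary, scheme_index, 5) for a in answers)
```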
To construct a corpus from a Wikipedia (or other MediaWiki-based) database dump, refer to this tutorial: https://radimrehurek.com/gensim/corpora/wikicorpus.html.
For detailed technical information, please refer to my research paper: https://www.ijariit.com/manuscript/answer-script-evaluator/
In the above figure, the entries at index 0 refer to the first sentence in the answer sheet. Each comparison score written against it gives a marking scheme index and the similarity with that marking scheme sentence. Here, 0, 0.9099441 indicates that sentence 0 of the answer sheet has a comparison percentage of about 90% with sentence 0 of the marking scheme.
Similarly, the second entry, 5, 0.68514407, indicates that sentence 0 of the answer sheet has a comparison percentage of about 68% with sentence 5 of the marking scheme.
In the same way, the entire set of sentences is listed for each answer script. For each index of the answer script, the marking scheme index with the maximum comparison percentage is selected and recorded as the most similar, together with the corresponding comparison percentage.
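For example, if the similarity output for sentence 0 of the answer sheet is the list of (marking scheme index, score) pairs shown in the figure, the most similar marking scheme sentence can be picked as follows (the list here is typed in by hand for illustration):

```python
# (marking scheme index, similarity) pairs for answer sheet sentence 0, as in the figure.
sims_for_sentence_0 = [(0, 0.9099441), (5, 0.68514407)]

# Select the marking scheme sentence with the maximum comparison percentage.
best_index, best_score = max(sims_for_sentence_0, key=lambda pair: pair[1])
print(best_index, int(best_score * 100))  # -> 0 90
```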
In the above answer, each sentence is again split into an individual list item, so there are 6 sentences ranging from index 0 to index 5.
In the above figure, each sentence is again split into a list item, and as can be seen, there are more sentences in the marking scheme than in the answer script. It can also be observed that the points are jumbled, i.e. they are not in the same order in both documents. Each sentence-to-sentence comparison is now carried out.
As shown in the above figure, the first column is the index of the marking scheme sentence, the second column is the index of the answer sheet sentence, and the last column is the comparison percentage of those two sentences. This is the process for one answer; the same procedure is applied to every answer in the answer script, and the comparison percentage is calculated for each one individually.
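The three columns in the figure can be produced with a loop like the one below, reusing the objects from the earlier sketches; answer_tokens is a hypothetical list of the already tokenised answer sentences.

```python
# Build (marking scheme index, answer sheet index, comparison percentage) rows for one answer.
rows = []
for ans_idx, tokens in enumerate(answer_tokens):          # answer sentences, already tokenised
    sims = scheme_index[lsi[dictionary.doc2bow(tokens)]]  # score against every scheme sentence
    scheme_idx = int(sims.argmax())                       # most similar marking scheme sentence
    rows.append((scheme_idx, ans_idx, round(float(sims[scheme_idx]) * 100, 2)))

for scheme_idx, ans_idx, percentage in rows:
    print(scheme_idx, ans_idx, percentage)
```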