Tokenising Requirements

JamesyJi edited this page Sep 2, 2021 · 1 revision

Overview

After the conditions have been processed and adjusted by our regex algorithm, the next step is to parse each condition into a list of tokens which our algorithm will then read from. Here is a rough overview, including examples, of the logic involved in this step.

  1. An opening and closing bracket are added around each condition
  2. Split on (, ), && and ||
  3. Split the remaining text on keywords (e.g. "in")
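The three steps above can be sketched as follows. This is a minimal illustration, not the project's actual implementation; in particular, it wraps every condition in an extra outer bracket pair unconditionally, which is an assumption based on step 1.

```python
import re

def tokenise(condition: str) -> list[str]:
    # Step 1: add an opening and closing bracket around the condition.
    condition = f"({condition})"
    # Step 2: split on (, ), && and ||. The capturing group makes
    # re.split keep the delimiters themselves in the result.
    chunks = re.split(r"(\(|\)|&&|\|\|)", condition)
    tokens: list[str] = []
    for chunk in chunks:
        # Step 3: whitespace-split whatever text remains, which separates
        # space-delimited keywords such as "in" from their neighbours.
        tokens.extend(chunk.split())
    return tokens
```

For example, `tokenise("COMP1511 || DPST1091")` yields `["(", "COMP1511", "||", "DPST1091", ")"]`.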

Simple ||

Original: COMP1511 || DPST1091 || COMP1911 || COMP1917

Tokenised: [(, COMP1511, ||, DPST1091, ||, COMP1911, ||, COMP1917, )]

Notes: Split on ||

Simple &&

Original: COMP1511 && DPST1091 && COMP1911 && COMP1917

Tokenised: [(, COMP1511, &&, DPST1091, &&, COMP1911, &&, COMP1917, )]

Notes: Split on &&

Complex &&, ||, (, )

Original: (MMAN2400 || ENGG2400) && (MMAN2100 || DESN2000)

Tokenised: [(, MMAN2400, ||, ENGG2400, ), &&, (, MMAN2100, ||, DESN2000, )]

Notes: Split on all of the key tokens: brackets, && and ||

Simple "in"

Original: 24UOC in COMP

Tokenised: [(, 24UOC, in, COMP, )]

Notes: Treat "in" as a keyword

Complex "in"

Original: 96UOC in (COMP || SENG || MATH) && COMP1511 && COMP1521

Tokenised: [(, 96UOC, in, (, COMP, ||, SENG, ||, MATH, ), &&, COMP1511, &&, COMP1521, )]

Notes: Treat "in" as a keyword
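A sketch of how "in" can fall out as its own token without special-casing: because the structural split on brackets, && and || leaves plain text between delimiters, whitespace-splitting that leftover text is enough to separate a space-delimited keyword like "in". This repeats the hypothetical tokeniser from above and is an assumption about the approach, not the project's actual code.

```python
import re

def tokenise(condition: str) -> list[str]:
    # Step 1: wrap the condition in an outer bracket pair.
    condition = f"({condition})"
    # Step 2: split on the structural tokens, keeping them via the capture group.
    chunks = re.split(r"(\(|\)|&&|\|\|)", condition)
    tokens: list[str] = []
    for chunk in chunks:
        # Step 3: "96UOC in " splits into ["96UOC", "in"], so the keyword
        # "in" becomes a standalone token with no extra handling.
        tokens.extend(chunk.split())
    return tokens
```

Running it on the complex example, `tokenise("96UOC in (COMP || SENG || MATH) && COMP1511 && COMP1521")` produces the token list shown above, with "in" as its own token.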