Character normalisation service that renders Unicode confusables, reads the string back via OCR, and makes a judgement about profanity.
A sample Postman collection is available here.
To test the service on your local machine, forward it to localhost and try the examples.
Access to the current integration environment is required:
kubectl -n lector port-forward service/lector 8000:8000
Sample payload:
{"toCheck": "ꜰᴜᴄᴋ ᴍᴇ"}
Sample response:
{
  "ocr": {
    "string": "FUCK ME",
    "profan": true
  },
  "raw": {
    "string": "ꜰᴜᴄᴋ ᴍᴇ",
    "profan": false
  },
  "transcribed": {
    "string": "ꜰucĸ ʍᴇ",
    "profan": false
  }
}
Response struct:
type Response struct {
	Ocr struct {
		String string `json:"string"`
		Profan bool   `json:"profan"`
	} `json:"ocr"`
	Raw struct {
		String string `json:"string"`
		Profan bool   `json:"profan"`
	} `json:"raw"`
	Transcribed struct {
		String string `json:"string"`
		Profan bool   `json:"profan"`
	} `json:"transcribed"`
}
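As an illustration of how a caller might consume this response, the sketch below decodes the sample response shown earlier and combines the three verdicts. It factors the repeated anonymous struct into a named `Variant` type, which produces the same JSON shape. `Variant`, `parseResponse` and `AnyProfane` are hypothetical names, and treating the input as profane when any of the three views is flagged is an assumed policy, not something the service prescribes.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Variant is one view of the input string plus its profanity verdict.
type Variant struct {
	String string `json:"string"`
	Profan bool   `json:"profan"`
}

// Response has the same JSON shape as the struct in this README,
// with the repeated inner struct factored out into Variant.
type Response struct {
	Ocr         Variant `json:"ocr"`
	Raw         Variant `json:"raw"`
	Transcribed Variant `json:"transcribed"`
}

// AnyProfane reports whether at least one view was flagged.
// ASSUMPTION: this "any view" policy is the caller's choice.
func (r Response) AnyProfane() bool {
	return r.Ocr.Profan || r.Raw.Profan || r.Transcribed.Profan
}

// sample is the sample response from this README.
const sample = `{
  "ocr":         {"string": "FUCK ME", "profan": true},
  "raw":         {"string": "ꜰᴜᴄᴋ ᴍᴇ", "profan": false},
  "transcribed": {"string": "ꜰucĸ ʍᴇ", "profan": false}
}`

// parseResponse decodes a JSON response body into a Response.
func parseResponse(data []byte) (Response, error) {
	var r Response
	err := json.Unmarshal(data, &r)
	return r, err
}

func main() {
	r, err := parseResponse([]byte(sample))
	if err != nil {
		fmt.Println("decode:", err)
		return
	}
	fmt.Printf("ocr=%q profane=%t\n", r.Ocr.String, r.Ocr.Profan)
	fmt.Printf("raw=%q profane=%t\n", r.Raw.String, r.Raw.Profan)
	fmt.Println("any profane:", r.AnyProfane())
}
```

In the sample, only the OCR view is flagged, which is why checking the raw string alone would miss it.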
Confusables in Unicode are characters that look like other characters.
http://www.unicode.org/Public/security/latest/confusables.txt
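To make the data file more concrete, here is a minimal sketch of parsing one of its lines in Go. It assumes the layout `source ; target ; type # comment`, with code points written in hexadecimal and the target possibly consisting of several code points. The sample line is written in that layout for illustration, `parseCodePoints` and `parseLine` are hypothetical helpers, and none of this is meant to describe how the lector service itself processes confusables.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseCodePoints turns space-separated hex code points ("0072 006E")
// into the string they encode.
func parseCodePoints(field string) (string, error) {
	var sb strings.Builder
	for _, hex := range strings.Fields(field) {
		cp, err := strconv.ParseUint(hex, 16, 32)
		if err != nil {
			return "", err
		}
		sb.WriteRune(rune(cp))
	}
	return sb.String(), nil
}

// parseLine parses one line of the data file.
// ok is false for blank lines and comment-only lines.
func parseLine(line string) (source, target string, ok bool, err error) {
	// Drop the trailing comment, then a possible byte order mark.
	if i := strings.IndexByte(line, '#'); i >= 0 {
		line = line[:i]
	}
	line = strings.TrimSpace(strings.TrimPrefix(line, "\ufeff"))
	if line == "" {
		return "", "", false, nil
	}
	fields := strings.Split(line, ";")
	if len(fields) < 2 {
		return "", "", false, fmt.Errorf("malformed line: %q", line)
	}
	if source, err = parseCodePoints(fields[0]); err != nil {
		return "", "", false, err
	}
	if target, err = parseCodePoints(fields[1]); err != nil {
		return "", "", false, err
	}
	if source == "" || target == "" {
		return "", "", false, fmt.Errorf("empty field in line: %q", line)
	}
	return source, target, true, nil
}

func main() {
	// Illustrative line: Cyrillic small letter es looks like Latin small c.
	line := "0441 ;\t0063 ;\tMA\t# CYRILLIC SMALL LETTER ES → LATIN SMALL LETTER C"
	src, dst, ok, err := parseLine(line)
	if err != nil || !ok {
		fmt.Println("could not parse line:", err)
		return
	}
	fmt.Printf("%q (%U) looks like %q\n", src, []rune(src), dst)
}
```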
If you would like to try more sophisticated strings, you can create your own here.
One possible answer to confusable-based obfuscation is this lector service.
Credits to: