• checking keywords concerning evil

    From Jivanmukta@jivanmukta@poczta.onet.pl to comp.programming on Tue Oct 31 12:03:02 2023
    From Newsgroup: comp.programming

    I wrote a PHP obfuscator in C++. I want to check, in C++, whether the
    project being obfuscated contains pornography, satanism, drugs, violence,
    prostitution, etc. (I don't want to obfuscate such projects). How can I do
    this? Where can I get a database of such keywords (English would be best,
    but the more languages the better)?
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Jivanmukta@jivanmukta@poczta.onet.pl to comp.programming on Tue Oct 31 16:05:43 2023
    From Newsgroup: comp.programming

    I found List-of-Dirty-Naughty-Obscene-and-Otherwise-Bad-Words-master on
    GitHub, which contains files of bad words, one file per language.
    But I am not sure how to design my algorithm. For example, a website with
    a single occurrence of the word 'sex' is acceptable, but a website in which
    20% of the words are bad words is not.
    Do you have an idea for an algorithm for this problem?
    I have an idea, but I am not sure whether it is OK:
    threshold_percentage = 2/3 * avg_percentage_of_bad_words_for_set_of_sample_bad_websites
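
    A minimal C++ sketch of that idea, to make it concrete. It is an
    illustration under assumptions, not a tested filter: the function names
    (tokenize, bad_word_percentage, compute_threshold, is_unacceptable) are
    hypothetical, the bad-word set is assumed to be already loaded from the
    word-list files, and tokenization is ASCII-only, so non-English lists and
    list entries made of several words would need more work.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <unordered_set>
#include <vector>

// Split text into lowercase ASCII words; any non-letter byte separates words.
// Assumption: ASCII-only tokenization, so UTF-8 words are not handled.
std::vector<std::string> tokenize(const std::string& text) {
    std::vector<std::string> words;
    std::string current;
    for (unsigned char ch : text) {
        if (std::isalpha(ch)) {
            current += static_cast<char>(std::tolower(ch));
        } else if (!current.empty()) {
            words.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) words.push_back(current);
    return words;
}

// Percentage (0..100) of the words in `text` that appear in `bad_words`.
// `bad_words` is assumed to hold lowercase single-word entries.
double bad_word_percentage(const std::string& text,
                           const std::unordered_set<std::string>& bad_words) {
    const std::vector<std::string> words = tokenize(text);
    if (words.empty()) return 0.0;
    const auto hits = std::count_if(
        words.begin(), words.end(),
        [&](const std::string& w) { return bad_words.count(w) > 0; });
    return 100.0 * static_cast<double>(hits) /
           static_cast<double>(words.size());
}

// threshold_percentage = 2/3 * average bad-word percentage over a set of
// sample texts that are known to be unacceptable.
double compute_threshold(const std::vector<std::string>& bad_samples,
                         const std::unordered_set<std::string>& bad_words) {
    // No calibration data: fall back to 100, which flags only texts that
    // consist entirely of bad words.
    if (bad_samples.empty()) return 100.0;
    double sum = 0.0;
    for (const std::string& sample : bad_samples) {
        sum += bad_word_percentage(sample, bad_words);
    }
    return (2.0 / 3.0) * (sum / static_cast<double>(bad_samples.size()));
}

// A text is rejected when its bad-word percentage reaches the threshold.
bool is_unacceptable(const std::string& text,
                     const std::unordered_set<std::string>& bad_words,
                     double threshold_percentage) {
    return bad_word_percentage(text, bad_words) >= threshold_percentage;
}
```

    The threshold would be computed once from the sample bad websites and then
    passed to is_unacceptable() for each project. Counting whole words rather
    than substrings avoids flagging harmless words that merely contain a bad
    word, and very short texts probably deserve a minimum word count before
    the percentage is trusted.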
