Old 03-20-2024, 04:33 PM   #20
lomkiri
Quote:
Originally Posted by moldy
To counteract this I tried wrapping John in \b anchors in the function
It should have worked in a regex, but not with Python's str.replace(), which searches for its argument as literal text and does not treat \b as a word boundary.
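To illustrate the difference, here is a minimal sketch using the standard-library re module (an assumption made so the snippet runs anywhere; the function below uses regex instead). The sample text and the replacement "Mick" are only for illustration.

```python
import re

text = "Johnson and John"

# str.replace() looks for the literal characters \bJohn\b, which are not in the text
literal = text.replace(r"\bJohn\b", "Mick")

# Without the anchors, str.replace() also hits the "John" inside "Johnson"
loose = text.replace("John", "Mick")

# In a regex, \b is a word boundary, so only the standalone word is replaced
bounded = re.sub(r"\bJohn\b", "Mick", text)

print(literal)  # Johnson and John
print(loose)    # Mickson and Mick
print(bounded)  # Johnson and Mick
```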
Quote:
I would like to go back to the dict method again (as described in lomkiri’s suggestion above).
Try this:
Code:
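    # Insert here the code that loads the JSON file into the dict "equiv"
    # (see my post #12 for this code)
    import regex
    m = match.group()  # the text matched by the search expression
    # Replace each key only where it stands alone as a whole word.
    # Keys containing regex special characters would need regex.escape(key).
    for key in equiv:
        m = regex.sub(rf'\b{key}\b', equiv[key], m)
    return m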
It works, I have tested it:
Johnson, Johnjo LongJohn and so on John and Ringo, and also john ==>
Johnson, Johnjo LongJohn and so on Mick and Charlie, and also john
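For anyone who wants to try the loop outside the editor, here is a self-contained sketch of the same idea. It is not the exact function above: it uses the standard-library re module instead of regex, a hypothetical helper name (replace_whole_words), and a dict written inline instead of loaded from the JSON file. The mapping is only what the test result above implies.

```python
import re

# Assumed mapping, inferred from the test result; the real one comes from the JSON file
equiv = {"John": "Mick", "Ringo": "Charlie"}


def replace_whole_words(text, mapping):
    """Replace every key of `mapping` found in `text`, but only as a whole word."""
    for key, value in mapping.items():
        # re.escape() protects keys that contain regex special characters
        text = re.sub(rf"\b{re.escape(key)}\b", value, text)
    return text


sample = "Johnson, Johnjo LongJohn and so on John and Ringo, and also john"
result = replace_whole_words(sample, equiv)
print(result)
# Johnson, Johnjo LongJohn and so on Mick and Charlie, and also john
```

"Johnson", "Johnjo" and "LongJohn" are left alone because there is no word boundary on both sides of "John" there, and "john" is left alone because the match is case-sensitive.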

Note: rf'\b{key}\b' is the same as r'\b{}\b'.format(key) and expands to r'\bJohn\b' if key == 'John'.
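A quick check of that equivalence in plain Python, and of why the r prefix matters (without it, '\b' is a backspace character and not a backslash followed by b):

```python
key = "John"

with_fstring = rf'\b{key}\b'         # raw f-string
with_format = r'\b{}\b'.format(key)  # raw string + str.format()

print(with_fstring == with_format)   # True
print(with_fstring == r'\bJohn\b')   # True

# Without the r prefix, '\b' is a single backspace character (\x08)
print('\b' == '\x08')                # True
print(len('\b'), len(r'\b'))         # 1 2
```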

It works with either <body[^>]*>\K(.+)</body> (with "dot all" checked) or >\K([^>]+)(?![^<>{}]*[>}]). The first form will be quicker, since it processes one whole HTML file per iteration, but only on the condition, as I said above, that none of your keys matches something inside an HTML tag. The second form selects the text between tags and skips whatever is inside the tags.
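A rough sketch of that trade-off, outside the editor. It assumes the standard-library re module, which has no \K, so the text-only variant uses a simplified pattern (>([^<>]+) with a callback) and not the second expression above. The HTML snippet, the mapping and the helper name substitute are made up for the example.

```python
import re

equiv = {"John": "Mick"}  # hypothetical mapping


def substitute(text):
    """Apply the whole-word replacements to a piece of text."""
    for key, value in equiv.items():
        text = re.sub(rf"\b{re.escape(key)}\b", value, text)
    return text


html = '<body><p class="John">John and Ringo</p></body>'

# Whole-body approach: one pass over everything, so the class attribute is hit too
whole = substitute(html)

# Text-only approach: only the runs of text following a '>' are passed on
text_only = re.sub(r">([^<>]+)", lambda m: ">" + substitute(m.group(1)), html)

print(whole)      # <body><p class="Mick">Mick and Ringo</p></body>
print(text_only)  # <body><p class="John">Mick and Ringo</p></body>
```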

Last edited by lomkiri; 03-21-2024 at 08:03 AM.