Anonymize (and then de-anonymize) comments in Word documents.

Piotr Czajkowski 611d4e01fd Correction 3 years ago
.github da92043ef1 Trying to fix it 3 years ago
testFiles 19ac79baa5 Better such tests than none 3 years ago
.gitignore 9a1c88ac44 Added .DS_Store 6 years ago
LICENSE.md d3ebd190e9 License 6 years ago
README.md 6fff1d4cfa No Windows 4 years ago
anonymize.c 42ffa3162b Slightly better 3 years ago
comments.c 2e81fd33be Additional error checking 4 years ago
comments.h 339f697a37 Keep it simple 6 years ago
makefile 611d4e01fd Correction 3 years ago
stopif.h 45bdd3d8ca Formatting 6 years ago
test.docx b13896656d Better version 7 years ago
xmlbuff.c 224dfee436 Updated and improved 3 years ago
xmlbuff.h 224dfee436 Updated and improved 3 years ago
zip.c 4834f86505 Fixed memory leak 3 years ago
zip.h 339f697a37 Keep it simple 6 years ago

README.md

Anonymize DOCX Comments

When reviewing Word documents, translators and reviewers often use tracked changes and comments to exchange feedback on translations. These people usually work for different organizations and shouldn't know about each other, hence the need to anonymize comments. That is what this tool does for you.

It goes through the comments in "word/comments.xml" and changes each author's name to AuthorN, where N starts from 1. It keeps track of authors, so "John Smith" will always be "Author1", for instance. When it's done, it prints the list of authors and their new names.
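The name mapping described above can be sketched as a small lookup table. This is only an illustration and not the code from comments.c: the `author_table` type, the function names and the fixed limits are all hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_AUTHORS 64   /* hypothetical limit, chosen for the sketch */
#define MAX_NAME 128     /* hypothetical limit, chosen for the sketch */

/* Hypothetical table: slot i holds the original name of Author<i+1>. */
typedef struct {
    char names[MAX_AUTHORS][MAX_NAME];
    size_t count;
} author_table;

/* Return the 1-based number for `name`, registering it on first sight.
 * Returns 0 if the table is full or the name does not fit. */
size_t author_number(author_table *t, const char *name)
{
    assert(t != NULL && name != NULL);
    for (size_t i = 0; i < t->count; i++)
        if (strcmp(t->names[i], name) == 0)
            return i + 1;               /* seen before: same number again */
    if (t->count == MAX_AUTHORS || strlen(name) >= MAX_NAME)
        return 0;
    strcpy(t->names[t->count], name);   /* length was checked above */
    return ++t->count;
}

/* Write the replacement ("Author1", "Author2", ...) into `out`.
 * Returns 0 on success, -1 on failure. */
int anonymized_name(author_table *t, const char *name, char *out, size_t out_size)
{
    size_t n = author_number(t, name);
    if (n == 0)
        return -1;
    int written = snprintf(out, out_size, "Author%zu", n);
    return (written < 0 || (size_t)written >= out_size) ? -1 : 0;
}

/* Print the summary in the same form as the tool's output. */
void print_summary(const author_table *t)
{
    for (size_t i = 0; i < t->count; i++)
        printf("\"%s\" is now \"Author%zu\"\n", t->names[i], i + 1);
}
```

Calling `anonymized_name` once per comment author and `print_summary` at the end would give the behaviour described: repeated names reuse their number, and new names get the next free one.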

Usage:

./anonymize test.docx - test.docx will be replaced with the anonymized version.

./anonymize test.docx test2.docx - the anonymized version will be saved as test2.docx, leaving the original test.docx intact.

Running it on the provided test.docx should produce:

"King, Stephen" is now "Author1"
"Kowalski, Jan" is now "Author2"
"Piotr Fronczewski" is now "Author3"

A file called test.docx.bin (or test2.docx.bin) will be created, containing details of the transformation.
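Judging from the two file names above, the bin file appears to be named after the output document with ".bin" appended. A minimal sketch of that naming rule follows. The `bin_path` helper is hypothetical and is not taken from the tool's source.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build "<docx_path>.bin" in `out`.
 * Returns 0 on success, -1 if `out` is too small. */
int bin_path(const char *docx_path, char *out, size_t out_size)
{
    assert(docx_path != NULL && out != NULL);
    /* sizeof ".bin" counts the terminating NUL as well. */
    if (strlen(docx_path) + sizeof ".bin" > out_size)
        return -1;
    snprintf(out, out_size, "%s.bin", docx_path);
    return 0;
}
```

With this rule, "test2.docx" maps to "test2.docx.bin", which matches the names mentioned above.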

You can also de-anonymize comments. The matching bin file, named after the document with ".bin" appended (for example test.docx.bin), must be present.

./anonymize test.docx -d - test.docx will be replaced with the de-anonymized version.

./anonymize test.docx -d test2.docx - the de-anonymized version will be saved as test2.docx, leaving the original test.docx intact.
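The four invocations shown in this README could be parsed along the following lines. This is a sketch under assumptions, not the logic in anonymize.c: the `options` struct and `parse_args` are hypothetical, and it assumes that "-d" is only accepted directly after the input file, as in the examples.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result of parsing the command line. */
typedef struct {
    const char *input;    /* document to read */
    const char *output;   /* document to write; same as input when omitted */
    bool deanonymize;     /* true when -d was given */
} options;

/* Accepts: <in>, <in> <out>, <in> -d, <in> -d <out>.
 * Returns 0 on success, -1 on any other shape. */
int parse_args(int argc, char **argv, options *opts)
{
    assert(argv != NULL && opts != NULL);
    if (argc < 2 || argc > 4)
        return -1;

    opts->input = argv[1];
    opts->output = argv[1];       /* default: replace the input in place */
    opts->deanonymize = false;

    int next = 2;
    if (next < argc && strcmp(argv[next], "-d") == 0) {
        opts->deanonymize = true;
        next++;
    }
    if (next < argc)
        opts->output = argv[next++];

    /* Anything left over means the arguments did not match a known form. */
    return next == argc ? 0 : -1;
}
```

Defaulting `output` to `input` keeps the "replace in place" and "save as" cases on one code path, so the rest of a program would only ever need to look at `opts->output`.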

You'll need libarchive, libxml2 and lbinn to compile it. It was created as a learning project while I was exploring C, so use it freely, but at your own risk. The output was tested with Word 2013 and LibreOffice Writer. Enjoy!