[liberationtech] Metadata Cleanup through File Format Conversion?
griffinboyce at gmail.com
Wed Jul 17 14:40:57 PDT 2013
Fabio Pietrosanti (naif) <lists at infosecurity.ch> wrote:
> Hi all,
> i've been thinking about the topic of metadata cleanup of files from an
> implementation point of view.
Media metadata is incredibly fascinating :D ObscuraCam does a really
great job of cleaning up JPEGs, but doesn't cover the other random picture
types that people tend to have around.
I've been mulling over the idea of a bash/python/etc. script that could
be run on an entire folder of random stuff and remove all the metadata.
This is one of those things that seems really easy conceptually, but has
really stumped me in practice. There are so many different types of
"metadata" that it's tricky to plot out a work plan for it. In any given
folder there might be Microsoft Word docs (with full revision history that
can reveal individuals' full names), photos (with personal EXIF/GPS data),
HTML files (marked with the source of the file)....
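As a starting point for that work plan, here is a rough Python sketch that only
walks a folder and sorts files by the kind of metadata they might carry. It
does not remove anything. The extension table and the name triage_folder are
my own assumptions for illustration, and guessing by extension is a shortcut;
a real tool would need to inspect file contents.

```python
import os
from collections import defaultdict

# Assumed, incomplete mapping from extension to the metadata category
# mentioned above. Extend it for whatever actually shows up in the folder.
METADATA_KINDS = {
    ".doc": "office",   # revision history, author names
    ".docx": "office",
    ".odt": "office",
    ".jpg": "image",    # EXIF/GPS data
    ".jpeg": "image",
    ".png": "image",
    ".tiff": "image",
    ".html": "html",    # saved-from/source markers
    ".htm": "html",
    ".pdf": "pdf",      # document info, embedded files, embedded fonts
}


def triage_folder(root):
    """Group every file under root by the kind of metadata it may carry."""
    plan = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            ext = os.path.splitext(name)[1].lower()
            kind = METADATA_KINDS.get(ext, "unknown")  # needs manual review
            plan[kind].append(os.path.join(dirpath, name))
    return dict(plan)


if __name__ == "__main__":
    import sys

    # Print a per-category file count for the folder given on the command line.
    for kind, paths in sorted(triage_folder(sys.argv[1]).items()):
        print(f"{kind}: {len(paths)} file(s)")
```

Each category would then get its own cleaner, and anything that lands in
"unknown" is exactly the part that makes this harder than it looks.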
PDFs are an interesting situation, because the PDF itself has metadata, the
files embedded within it have metadata, and even embedded fonts can have
metadata that could reveal the source of the document. This should still be
the case when exporting/converting from ODF/DOC to PDF (unless everything
goes through some type of cleaning process before the original document
is created). Depending on the document, this could be a good
thing. It might be possible to prove that the origin of Evidence X was
Corrupt CEO Y using metadata. By the same token, it's just as likely
to prove that Leak A came from Intel Analyst B. Even the NSA's weirded out
about it.
Just another hacker in the City of Spies.
My posts, while frequently amusing, are not representative of the thoughts
of my employer.