[liberationtech] [open-science] Removing watermarks from pdfs (pdfparanoia)
pm286 at cam.ac.uk
Tue Feb 5 13:47:34 PST 2013
On Tue, Feb 5, 2013 at 9:15 PM, Bryan Bishop <kanzure at gmail.com> wrote:
> On Tue, Feb 5, 2013 at 3:09 PM, Peter Murray-Rust <pm286 at cam.ac.uk> wrote:
>> PDF2SVG should be able to do this (http://bitbucket.org/petermr/pdf2svg).
>> It should also remove the side annotations about which library the PDF was
>> downloaded from. Send me one and I'll see.
> Is there an svg2pdf? The problem with using pdfquery is that it can only
> generate an xml format, and at first it looks like pdfxml, except Adobe
> came up with a "standard" called pdfxml that looks completely different. So
> getting things back into pdf seems to be difficult.
I use Apache FOP. We should be able to:
* read PDF into SVG
* remove the rubbish
* write the primitives back into PDF. We might get font problems, so you may
have to make do with the PDF standard 14 fonts. That might occasionally
screw up some of the microkerning. If you want to reformat running text and
lose the publisher's layout (e.g. 2-col => 1-col), then we will use SVGPlus.
Some of this is alpha, not production.
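
To make the "remove the rubbish" step concrete, here is a rough sketch in Python. It is not PDF2SVG's or FOP's API. It assumes the intermediate SVG carries the library stamp as ordinary <text> elements, and the wording in WATERMARK is a made-up example, since real publisher stamps vary. The function name strip_watermarks is likewise hypothetical.

```python
import re
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Hypothetical stamp wording (an assumption); adjust per publisher.
WATERMARK = re.compile(r"downloaded (from|by)", re.IGNORECASE)


def strip_watermarks(svg_text, pattern=WATERMARK):
    """Return the SVG with every <text> element matching `pattern` removed."""
    # Keep the default SVG namespace unprefixed when serialising.
    ET.register_namespace("", SVG_NS)
    root = ET.fromstring(svg_text)
    text_tag = "{%s}text" % SVG_NS
    # ElementTree has no parent pointers, so visit each element as a
    # potential parent and inspect a snapshot of its children.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag != text_tag:
                continue
            content = "".join(child.itertext())
            if pattern.search(content):
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")
```

Reading the PDF into SVG and writing the cleaned primitives back out are deliberately left out here; this only covers the filtering in the middle, and a stamp drawn as paths rather than text would need a different test.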
> - Bryan
> 1 512 203 0507
Reader in Molecular Informatics
Unilever Centre, Dep. Of Chemistry
University of Cambridge
CB2 1EW, UK