
PDF manipulations

2015-05-11: Warning.
The script cannot be used unchanged with Python 3. With Python 2.7 it is recommended to install pdf, argparse and pyPdf with pip install. On Windows, add both the Python27 folder and its Scripts folder to the PATH.

As a teacher of Advanced Engineering Mathematics I grade a lot of homework. The students upload Maple output as PDF files to the campus website, and I access the files in a class/student_id folder structure extracted from a zip file. I move the PDF files to the top-level folder and merge them using the shell and Python scripts below. Once that is done I can print the files with four pages on each sheet, which reduces paper waste and speeds up commenting and grading.
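The move step can also be done in Python instead of the shell. The following is only a minimal sketch, assuming Python 3 and nothing beyond the standard library. The function name `flatten_pdfs` and the numbered-suffix naming are hypothetical; they differ from `mv --backup=numbered`, which renames the file already at the destination to a `~N~` backup.

```python
import shutil
from pathlib import Path


def flatten_pdfs(root):
    """Move every PDF found below `root` into `root` itself.

    Name clashes get a numeric suffix (hw.pdf, hw.1.pdf, ...). This naming
    scheme is an assumption made for the sketch.
    """
    root = Path(root)
    moved = []
    # sorted() builds the full list before anything is moved and makes
    # the order, and thus the suffixes, reproducible.
    for src in sorted(root.rglob("*.pdf")):
        if src.parent == root:
            continue  # already at the top level
        dest = root / src.name
        counter = 1
        while dest.exists():
            dest = root / f"{src.stem}.{counter}{src.suffix}"
            counter += 1
        shutil.move(str(src), str(dest))
        moved.append(dest)
    return moved
```

Calling `flatten_pdfs(".")` from the class folder would collect the files of all student_id subfolders in one place, ready for merging.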

Utilities: MinGW, pdfmerge.py (a modification of a script found on the internet) and pyPdf:


#!/usr/bin/env python
# -*- coding: utf-8 -*-

# Install pyPdf: in the pyPdf folder (find the package on the internet), run
#   python setup.py install


# Example: the first line moves the PDF files to the current folder, the second
# appends the files and adds blank pages as necessary.
#   mv --backup=numbered **/*.pdf .
#   python ../Scripts/pdfmerge.py -p=. -o=output.pdf -b=../Scripts/blank-page.pdf



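import os
from argparse import ArgumentParser
from glob import glob
from pyPdf import PdfFileReader, PdfFileWriter

def merge(path, blank_filename, output_filename):
    # The blank-page file must stay open until the output has been written.
    blank = PdfFileReader(open(blank_filename, "rb"))
    output = PdfFileWriter()

    # Look for the PDF files in `path`; sort them so the merge order is reproducible.
    for pdffile in sorted(glob(os.path.join(path, '*.pdf'))):
        # Skip an output file left over from an earlier run.
        if os.path.abspath(pdffile) == os.path.abspath(output_filename):
            continue
        print("Parse '%s'" % pdffile)
        document = PdfFileReader(open(pdffile, 'rb'))
        num_pages = document.getNumPages()
        for i in range(num_pages):
            output.addPage(document.getPage(i))
        # Pad to a multiple of four pages, so that each document starts on a
        # new sheet when printing four pages per sheet.
        i = (4 - num_pages) % 4
        while i > 0:
            output.addPage(blank.getPage(0))
            print("Add blank page to '%s' (had %i pages)" % (pdffile, num_pages))
            i = i - 1
    print("Start writing '%s'" % output_filename)
    output_stream = open(output_filename, "wb")
    output.write(output_stream)
    output_stream.close()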

if __name__ == "__main__":
    parser = ArgumentParser()

    # Add more options if you like
    parser.add_argument("-o", "--output", dest="output_filename", default="merged.pdf",
                      help="write merged PDF to FILE", metavar="FILE")

    parser.add_argument("-b", "--blank", dest="blank_filename", default="blank.pdf",
                      help="path to blank PDF file", metavar="FILE")

    parser.add_argument("-p", "--path", dest="path", default=".",
                      help="path of source PDF files")


    args = parser.parse_args()
    merge(args.path, args.blank_filename, args.output_filename)
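The padding rule in the script is `(4 - n) % 4`: the number of blank pages that brings a document of `n` pages up to the next multiple of four. The sketch below isolates that rule and shows how the merge might look under Python 3. It assumes the `pypdf` package (a later successor of pyPdf, not used in the original script); the names `blanks_needed` and `merge_py3` are hypothetical. Instead of a separate blank-page PDF it uses `add_blank_page()`, which is a swapped-in technique and not what the script above does.

```python
def blanks_needed(num_pages, per_sheet=4):
    """Blank pages required to fill the last sheet of a document."""
    return (per_sheet - num_pages) % per_sheet


def merge_py3(paths, output_filename, per_sheet=4):
    """Append the PDFs in `paths`, padding each one to a full sheet."""
    # Imported here so that blanks_needed() can be used without pypdf
    # installed (pypdf is an assumption, see above).
    from pypdf import PdfReader, PdfWriter

    writer = PdfWriter()
    for path in paths:
        reader = PdfReader(path)
        for page in reader.pages:
            writer.add_page(page)
        for _ in range(blanks_needed(len(reader.pages), per_sheet)):
            # With no size given, the blank page takes the size of the
            # last page added to the writer.
            writer.add_blank_page()
    with open(output_filename, "wb") as stream:
        writer.write(stream)


# Documents of 1..8 pages need this many blank pages each:
print([blanks_needed(n) for n in range(1, 9)])  # [3, 2, 1, 0, 3, 2, 1, 0]
```

A call such as `merge_py3(sorted(glob("*.pdf")), "output.pdf")` would then stand in for the command-line invocation, with `glob` imported from the standard library.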


Popular posts from this blog

HackRF on Windows 8

This technical note is based on an extract from a thread. I have made several changes and added recommendations. I have experienced a lot of latency using GnuRadio and HackRF on Pentoo Linux, so I wanted to try out GnuRadio on Windows.



HackRF One is a transceiver, so besides SDR capabilities it can also transmit signals, including sweeps over a given range as well as uniform and Gaussian signals. Pentoo Linux provides the most direct access to HackRF and toolboxes. If you install Pentoo Linux on a separate drive, you can use osmocom_siggen from a terminal to transmit signals such as near-field GSM bursts, which will only be detectable within a meter.

Installation of MinGW and CMake: Download and install the following packages:
- MinGW Setup (go to the Installer directory and download the setup file)
- CMake (I am using CMake 3.2.2 and installed it in C:\CMake; this path is important for the commands we must run in the MinGW shell)
Download and extract the packages respectively in the path C:\MinGW\msys\…

Example: Beeswarm plot in R

library(foreign)

data <- read.dta("C:/Users/hellmund/Documents/MyStataDataFile.dta")

names(data)

install.packages('beeswarm')

library(beeswarm)

levels(data$group)

png(file="C:/Users/hellmund/Documents/il6.png", bg="transparent")

beeswarm(data$il6~data$group,data=data, method=c("swarm"),pch=16,pwcol=data$Gender,xlab='',ylab='il6',ylim=c(0,20))

legend('topright',legend=levels(data$Gender),title='Gender',pch=16,col=2:1)

boxplot(data$il6~data$group, data=data, add = T, names = c("","",""), col="#0000ff22")

dev.off()

Example: Business cards typeset with LaTeX

So you enjoy the quality of a professional typesetting system? You got Avery labels, a working MikTeX and the ticket package installed...
You might find some assistance from a half criminal paranoid zealot system administrator, willing to guide you through a dinosaur kingdom of TeX ... but that kind of assistance might also just leave you with nothing.

It was easy to get the layout of the labels with the option zw32010, but how about page margins? I tried to set things straight with the layouts package (\usepackage{layouts}\currentpage \pagedesign), but then there was still some unwanted white space and margins...

To make things less complicated I decided to make a single card. The solution is a hack because it needs customization (with voffset and hoffset, as you can see in the TeX code below), but the adjustment is more straightforward, especially if you use the boxed option with ticket.

The card was converted to png with Ghostscript and I could easily print the business cards with Averys …