
2009-04-15 Mass-processing

posted Apr 14, 2009, 6:46 PM by Unknown user   [ updated Jun 27, 2009, 11:25 PM by Eddie Woo ]

1. Past papers - error fixing

I have been busy over the past few days processing past papers, starting from year 7 upwards. By "processing", I am referring to the same process that is described in the IPT syllabus - modifying the "original data" to correct errors, such as this one:



These errors can be fixed by opening the original file manually in Microsoft Office 2007 and printing it with the PDF plugin, instead of letting Adobe Acrobat convert it automatically. This method is more time-consuming and produces larger files (as would be expected!), but it has the extremely valuable benefit of making every question actually readable.

2. Past papers - ensuring PDF/A compliance

When you're distributing documents digitally to a huge audience, you want to ensure that everyone is able to open the files. Adobe has a rather annoying habit of introducing a new version of the PDF format with each new version of Adobe Acrobat, which means that files created in a new version of the software may not open in an older version.
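One rough way to see which format version a given file declares is to read its header, which starts with `%PDF-` followed by the version number. The following is a minimal sketch only: the function names `pdf_version` and `pdf_version_of_file` are hypothetical, and it reads only the header, so it says nothing about whether a file is actually PDF/A compliant.

```python
import re
from typing import Optional

# Standard PDF header marker, e.g. b"%PDF-1.4".
_HEADER = re.compile(rb"%PDF-(\d+\.\d+)")


def pdf_version(data: bytes) -> Optional[str]:
    """Return the version declared in a PDF header (e.g. '1.4'), or None.

    Hypothetical helper for illustration. Caveat: a PDF may override the
    header version with a /Version entry in its catalog, which this
    sketch does not inspect.
    """
    # The header is expected near the start of the file, so only the
    # first 1024 bytes are searched.
    match = _HEADER.search(data[:1024])
    return match.group(1).decode("ascii") if match else None


def pdf_version_of_file(path: str) -> Optional[str]:
    """Read the start of the file at `path` and return its declared version."""
    with open(path, "rb") as handle:
        return pdf_version(handle.read(1024))
```

Calling `pdf_version_of_file` on each past paper would give a quick list of which files declare a newer version than older readers are likely to support.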

PDF/A files are created such that they are independent of other data sources (e.g. Microsoft Equation Editor), so that someone without those data sources (e.g. me and my version of Office that doesn't have Microsoft Equation Editor) is able to read the file. Of course, this means that all the data dependencies need to be embedded in the file itself, rather than linked to, resulting in larger files.

I am currently in the process of checking every single file and saving each one as a PDF/A-compliant document.

By storing files in PDF/A, I should be able to reduce the likelihood that errors like the one shown above occur when the PDF files are opened under different setups (note: the error above was caused not by the resulting PDF file but by the process used to convert it). More importantly, storing as PDF/A should also ensure that older versions of Adobe Acrobat are able to open the files.

3. Past papers - naming convention

As I will need to check and update every file, I figured that this would be a good time to change the naming convention to make more sense. Previously, files were named like yr10_2002_hy.pdf - however, this sorts by grade, then year, and then paper. Naturally, it would make more sense to sort by grade, then paper, and then year - yr10_hy_2002.pdf - which is how it will be after I finish this very long process.
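The renaming itself could be scripted rather than done by hand. The following is a minimal sketch, assuming every old file name follows the pattern grade_year_paper.pdf exactly as in yr10_2002_hy.pdf; the function names `new_name` and `rename_all`, the regular expression, and the dry-run behaviour are all assumptions for illustration, not the process actually used.

```python
import re
from pathlib import Path
from typing import List, Optional, Tuple

# Assumed old convention: yr<grade>_<4-digit year>_<paper>.pdf
OLD_NAME = re.compile(r"^(yr\d+)_(\d{4})_([A-Za-z0-9]+)\.pdf$")


def new_name(old: str) -> Optional[str]:
    """Map 'yr10_2002_hy.pdf' to 'yr10_hy_2002.pdf'.

    Returns None for names that do not match the old convention,
    which includes names already in the new order.
    """
    match = OLD_NAME.match(old)
    if match is None:
        return None
    grade, year, paper = match.groups()
    return f"{grade}_{paper}_{year}.pdf"


def rename_all(folder: str, dry_run: bool = True) -> List[Tuple[str, str]]:
    """Plan (and optionally perform) the renames for every PDF in `folder`.

    With dry_run=True nothing on disk is changed; the planned
    (old, new) pairs are only returned for inspection.
    """
    planned: List[Tuple[str, str]] = []
    for path in sorted(Path(folder).glob("*.pdf")):
        target = new_name(path.name)
        if target is None:
            continue  # not in the old convention, leave untouched
        destination = path.with_name(target)
        if destination.exists():
            continue  # never overwrite an existing file
        planned.append((path.name, target))
        if not dry_run:
            path.rename(destination)
    return planned
```

Running `rename_all("papers")` first (where "papers" is a placeholder folder name) lists the planned changes without touching anything; passing `dry_run=False` then applies them. Once renamed, an ordinary alphabetical sort groups all papers of the same type for a grade together, ordered by year.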