Render PDF Files With HTML5 and JavaScript Using PDF.JS
Although this project is still far from production-ready, it’s really cool, and I thought I would take a quick look at it. I’ve tried it in Chrome and Firefox 5, and it works quite well apart from a few rendering issues.
So far there’s not much information about the project out there, but Andreas Gal, one of the creators, has a blog post about it. You can check it out here: http://andreasgal.com/2011/06/15/pdf-js/.
There’s also a demo here: http://people.mozilla.org/~gal/test.html.
I’ve also put together a little demo to play with it myself. It’s mostly copied and pasted from the official demo, but I thought I should give you a quick walkthrough. So let’s get started.
Step 1. The HTML page
Start by creating a blank HTML page and a blank CSS file. You can leave the CSS file empty, but for some reason the PDF library requires a CSS file to be linked to your page; otherwise it will crash. Download the example from the bottom of this page and copy the lib folder to the folder where you created your HTML file. The lib folder contains the JavaScript files that PDF.JS needs to work.
In the HTML below we start off by adding our CSS file. Then we add the test.js script, which will contain our code to load the PDF file, followed by the PDF.JS library scripts. In the onload event on our body tag we call the load function, which is defined in test.js, with the URL of our PDF file as a parameter. Next we add a pager and a canvas that PDF.JS will draw the PDF to.
<!DOCTYPE html>
<html>
<head>
<title>PDF Viewer using PDF.JS and HTML5</title>
<link rel="stylesheet" href="test.css">
<script src="test.js"></script>
<script src="lib/pdf.js"></script>
<script src="lib/fonts.js"></script>
<script src="lib/cffStandardStrings.js"></script>
<script src="lib/Encodings.js"></script>
<script src="lib/glyphlist.js"></script>
</head>
<body onload="load('compressed.tracemonkey-pldi-09.pdf')">
Page: <span id="pageNumber"></span> of <span id="numPages"></span>
<a href="javascript:prevPage()">Previous</a> |
<a href="javascript:nextPage()">Next</a>
<!-- Canvas dimensions must be specified in CSS pixels. CSS pixels
are always 96 dpi. 816x1056 is 8.5x11in at 96dpi. -->
<!-- We're rendering here at 1x scale. -->
<canvas id="canvas" width="816" height="1056"></canvas>
</body>
</html>
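The comment above the canvas explains where 816x1056 comes from. If you want to try another page size or zoom level, the arithmetic can be sketched like this. Note that canvasSizeFor is a helper name I made up for illustration; it is not part of PDF.JS, and the demo itself only uses the hard-coded 1x size.

```javascript
// CSS pixels are defined as 96 per inch.
var CSS_DPI = 96;

// Hypothetical helper (not part of PDF.JS): canvas dimensions in CSS pixels
// for a page size given in inches, multiplied by a zoom factor.
function canvasSizeFor(widthIn, heightIn, scale) {
  return {
    width: Math.round(widthIn * CSS_DPI * scale),
    height: Math.round(heightIn * CSS_DPI * scale)
  };
}

// US Letter (8.5 x 11 in) at 1x scale, as used in the page above.
var letter = canvasSizeFor(8.5, 11, 1);
console.log(letter.width + "x" + letter.height); // 816x1056
```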
Step 2. The test.js JavaScript
Create the test.js file in the same folder as your HTML file. Open it and start by adding the following code. The global variables will hold references to the PDF document and our canvas, the current page number, the ID of the timer used while fonts are loading and the total number of pages in our document.
The load function creates a reference to canvas, resets the page number and calls the open function to load the PDF.
var pdfDocument, canvas, pageNum, pageInterval, numPages;
function load(userInput) {
canvas = document.getElementById("canvas");
canvas.mozOpaque = true;
pageNum = 1;
var fileName = userInput;
open(fileName);
}
Next comes the open function. It downloads the PDF using an AJAX request, and when the PDF has loaded we call the PDF.JS library to create a PDFDoc object. Then we display the total number of pages and call displayPage to display the first page.
function open(url) {
var req = new XMLHttpRequest();
req.open("GET", url);
req.mozResponseType = req.responseType = "arraybuffer";
// Pages opened from file: report status 0 on success instead of 200
req.expected = (document.URL.indexOf("file:") == 0) ? 0 : 200;
req.onreadystatechange = function() {
if (req.readyState == 4 && req.status == req.expected) {
var data = req.mozResponseArrayBuffer || req.mozResponse ||
req.responseArrayBuffer || req.response;
pdfDocument = new PDFDoc(new Stream(data));
numPages = pdfDocument.numPages;
document.getElementById("numPages").innerHTML = numPages.toString();
displayPage(pageNum);
}
};
req.send(null);
}
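The req.expected line is easy to miss, so here is that success check pulled out on its own. This is only a sketch: expectedStatus and isLoaded are names I made up, and the URLs in the usage lines are placeholders. The idea is that a request made from a page opened via file: finishes with status 0, while a page served over HTTP gets the usual 200.

```javascript
// Hypothetical helper: the status code that means "success" for a page
// loaded from pageUrl. Mirrors the req.expected line in open().
function expectedStatus(pageUrl) {
  return pageUrl.indexOf("file:") === 0 ? 0 : 200;
}

// Hypothetical helper: true once the request has completed (readyState 4)
// with the status we expect for this kind of page.
function isLoaded(readyState, status, pageUrl) {
  return readyState === 4 && status === expectedStatus(pageUrl);
}

// Placeholder URLs, for illustration only.
console.log(isLoaded(4, 0, "file:///home/me/demo/test.html")); // true
console.log(isLoaded(4, 200, "http://example.com/test.html")); // true
console.log(isLoaded(3, 200, "http://example.com/test.html")); // false
```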
Now it’s time to add the displayPage function, which renders a specified page of the PDF document to our canvas. It loads the fonts, sets the current page number and calls the PDF.JS library to do the rendering.
Due to the lack of documentation I can’t really explain everything that’s going on in this function. It’s a straight copy and paste from the example I’ve linked to above.
function displayPage(num) {
if (pageNum != num)
window.clearInterval(pageInterval);
document.getElementById("pageNumber").innerHTML = num;
var page = pdfDocument.getPage(pageNum = num);
var ctx = canvas.getContext("2d");
ctx.save();
ctx.fillStyle = "rgb(255, 255, 255)";
ctx.fillRect(0, 0, canvas.width, canvas.height);
ctx.restore();
var gfx = new CanvasGraphics(ctx);
// page.compile will collect all fonts for us, once we have loaded them
// we can trigger the actual page rendering with page.display
var fonts = [];
page.compile(gfx, fonts);
var fontsReady = true;
// Inspect fonts and translate the missing ones
var count = fonts.length;
for (var i = 0; i < count; i++) {
var font = fonts[i];
if (Fonts[font.name]) {
fontsReady = fontsReady && !Fonts[font.name].loading;
continue;
}
new Font(font.name, font.file, font.properties);
fontsReady = false;
}
function delayLoadFont() {
// Wait until every collected font has finished loading
for (var i = 0; i < count; i++) {
if (Fonts[fonts[i].name].loading) {
return;
}
}
clearInterval(pageInterval);
page.display(gfx);
}
if (fontsReady) {
delayLoadFont();
} else {
pageInterval = setInterval(delayLoadFont, 10);
}
}
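As far as I can tell, the part of displayPage that waits for fonts boils down to one question: has every font on the page finished loading? Here is a minimal sketch of that check as a standalone function. allFontsLoaded is my own name, and the registry argument is a plain object standing in for the global Fonts object the demo uses; I am assuming each entry exposes a loading flag, as the code above suggests.

```javascript
// Hypothetical helper: true only when every font in the list has an entry
// in the registry and that entry is no longer marked as loading.
function allFontsLoaded(fonts, registry) {
  for (var i = 0; i < fonts.length; i++) {
    var entry = registry[fonts[i].name];
    if (!entry || entry.loading) {
      return false;
    }
  }
  return true;
}

// Made-up font names and registry, standing in for the real Fonts object.
var registry = {
  FontA: { loading: false },
  FontB: { loading: true }
};
console.log(allFontsLoaded([{ name: "FontA" }], registry));                    // true
console.log(allFontsLoaded([{ name: "FontA" }, { name: "FontB" }], registry)); // false
```

Each font is looked up by its own index in the loop, so a single slow font is enough to keep the page from being drawn too early.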
Lastly we add two simple functions to handle the next and previous page links.
function prevPage() {
if(pageNum > 1) {
displayPage(pageNum - 1);
}
}
function nextPage() {
if(pageNum < numPages) {
displayPage(pageNum + 1);
}
}
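The two guards above keep the page number between 1 and numPages. If you later want something like a "go to page" box, the same rule can be written as a single clamp. This is just a sketch: clampPage is a name I made up, it assumes the document has at least one page, and the page counts below are for a hypothetical 14-page document.

```javascript
// Hypothetical helper: force a requested page number into the range
// 1..total. Assumes total is at least 1.
function clampPage(requested, total) {
  return Math.min(Math.max(requested, 1), total);
}

console.log(clampPage(0, 14));  // 1
console.log(clampPage(15, 14)); // 14
console.log(clampPage(7, 14));  // 7
```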
That’s it! You should now be able to render PDFs on your site. You can try out or download the complete demo below.
Demo
Try out the finished example here: Example.
Download
Download the source from here: Download.
It does not render PDF files with UTF-8 text. Can you help me?
when i replace my pdf file with yours [compressed.tracemonkey-pldi-09.pdf], it does not work, can you tell me why please?
thank you very much
Jeff
Is there a chance you can redo the example with the current version of pdfjs?
Hello Nuno, I actually haven’t checked out pdfjs for a while, I know, I’m sorry. But I might take you up on that as it’s a really interesting project.
I tried to dig in the code on the git repo of the project but I just can’t understand it. If you remake this, alert me by email please!
Thanks