Wednesday, April 14, 2010

Webpage screen capturing using khtml2png

Recently we were working on a PHP project that required "webpage screen capturing" functionality. I googled around and found some tools: some Windows-based, some paid... obviously I was looking for a *FREE* tool :). As we work on LAMP (Linux, Apache, MySQL & PHP), I wanted a Linux-based tool.

One solution I found was to use html2ps and then ps2png/ps2jpg/ps2gif to convert the output to an image, followed by ImageMagick for image manipulation. I got stuck on some weird memory-related errors, package conflicts, formatting issues and so on. So, after spending a day getting nowhere, I dropped this solution.

Then I tried khtml2png (http://khtml2png.sourceforge.net) and, after some R&D, it worked for us.

Some points to remember...
- You need VPS/dedicated hosting to set up these tools. On shared hosting it's not possible to install them because of various restrictions imposed by hosting providers.

- This tool requires some libraries and tools: g++, KDE 3.x, kdelibs for KDE 3.x, zlib (zlib1g-dev) and cmake

- This tool uses KDE (K Desktop Environment), which means that whenever you use khtml2png it opens a window for *a while* during the webpage screenshot capture. We can avoid this by using "Xvfb". We will see how to install and configure it later.

- These links will be helpful if you are planning to develop a web application with webpage screen capturing using khtml2png
http://khtml2png.sourceforge.net/index.php?page=faq
http://www.mysql-apache-php.com/website_screenshot.htm

Here is a step-by-step guide to installing the various dependencies and packages. (I installed these tools on Fedora 7 & RHEL 5 successfully.)

I used the "yum" command to install and auto-configure these tools. If "yum" is not available on your machine, get it from http://yum.baseurl.org/ and install it.

Step:1

yum install ImageMagick

yum install Xvfb

yum install gcc gcc-c++ automake autoconf nano zlib zlib-devel

yum groupinstall "X Window System" "KDE (K Desktop Environment)"

yum install kdelibs kdelibs-devel

yum install Xvfb xorg xorg-x11-font*
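
Before moving on, it can help to confirm that the commands you expect are actually on the PATH. Here is a minimal sketch of such a check. The function name missing_tools is my own, and the command names in the usage comment are assumptions about what the packages above install.

```shell
#!/bin/sh
# Print the name of every command in the argument list that is not on PATH.
# Returns 0 if all commands were found, 1 otherwise.
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            rc=1
        fi
    done
    return "$rc"
}

# Example (command names are assumptions, adjust to your system):
# missing_tools g++ make convert Xvfb
```

If the function prints nothing, every listed command was found.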

Step:2 Install *cmake*
Go to the share directory by typing
cd /usr/local/share/
or any other directory where you want to download the package. (Check http://www.cmake.org for the latest "cmake" version.)

wget http://www.cmake.org/files/v2.8/cmake-2.8.1.tar.gz

tar -xzvf cmake-2.8.1.tar.gz

cd cmake-2.8.1

./bootstrap

make

make install

Step:3 Download & install *khtml2png* on your server as per the instructions at this link.
http://khtml2png.sourceforge.net/index.php?page=download

Step:4 Check if *khtml2png* is working

/usr/local/bin/khtml2png2 'http://www.yahoo.com' yahoo.png

(this will capture the Yahoo homepage in yahoo.png)
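
A command that exits without an error does not always mean you got a usable image, so a quick sanity check on the output file is handy. The sketch below tests whether a file starts with the standard 8-byte PNG signature. The function name is_png is my own helper and is not part of khtml2png.

```shell
#!/bin/sh
# Succeed if the given file starts with the 8-byte PNG signature
# (89 50 4e 47 0d 0a 1a 0a).
is_png() {
    [ -f "$1" ] || return 1
    sig=$(head -c 8 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$sig" = "89504e470d0a1a0a" ]
}

# Example:
# is_png yahoo.png && echo "capture looks like a PNG"
```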

Step:5 Install *khtmld* (a daemon required to run khtml2png in the background)
http://wiki.goatpr0n.de/projects/khtmld

I faced a couple of problems while setting up *khtmld*, but they were solved by reading the suggestions at the above link.

I installed all of the above tools as the *root* user.

Once you are done with the above steps, let's play with *khtml2png*

How to start?
Run the following commands to run khtml2png without a visible X session

Xvfb :2 -screen 0 1024x768x24&
export DISPLAY=localhost:2.0
(you can put the two lines above in rc.local so that Xvfb starts automatically whenever the server restarts)
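
If the start-up lines can run more than once, it is worth avoiding a second Xvfb on the same display. Here is a rough sketch, assuming the X server leaves its usual lock file /tmp/.X<N>-lock for display N. The function names display_in_use and start_xvfb are my own, and the lock directory is a parameter only so that the check can be tried without a real X server.

```shell
#!/bin/sh
# Succeed if an X server lock file exists for the given display number.
# Assumption: X servers create /tmp/.X<N>-lock for display :N.
display_in_use() {
    display_num=$1
    lock_dir=${2:-/tmp}
    [ -e "$lock_dir/.X${display_num}-lock" ]
}

# Start Xvfb on the given display unless something already holds it.
start_xvfb() {
    display_num=$1
    if display_in_use "$display_num"; then
        echo "display :$display_num already in use"
        return 0
    fi
    Xvfb ":$display_num" -screen 0 1024x768x24 &
}

# Example:
# start_xvfb 2
```

Note that a stale lock file left behind by a crashed server would make this check report the display as busy.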

Then start the *khtmld* daemon as your webserver user (for me it is *apache*) so that the PHP script has permission to talk to this daemon. (Run the command below after logging in as the *root* user.)

khtmld -K /usr/local/bin/khtml2png2 -c /etc/khtmldrc --user apache&

"-K /usr/local/bin/khtml2png2" is the path to khtml2png2; by default "khtmld" looks for the old "khtml2png" (khtml2png2 is the latest version). Find the khtml2png2 path using

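whereis khtml2png2

"-c /etc/khtmldrc" is the config file path for khtmld (you can create this config file if it's not already there)
Sample content for khtmldrc (if Xvfb runs on display :2 as above, the display value presumably needs to be :2.0 rather than :0.0)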

width=1024
height=768
display=:0.0
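
If you set up more than one server, generating this file from a script keeps the values consistent. A minimal sketch follows. It writes only the three keys shown in the sample above, and the function name write_khtmldrc is my own.

```shell
#!/bin/sh
# Write a khtmldrc file containing the three keys from the sample above.
# Arguments: config path, width, height, display.
write_khtmldrc() {
    rc_path=$1
    width=$2
    height=$3
    display=$4
    printf 'width=%s\nheight=%s\ndisplay=%s\n' \
        "$width" "$height" "$display" > "$rc_path"
}

# Example (same values as the sample content):
# write_khtmldrc /etc/khtmldrc 1024 768 :0.0
```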

Capture image using *khtmld*

echo "http://www.yahoo.com /tmp/yahoo.png" >/tmp/khtmldspool
(for more details - http://wiki.goatpr0n.de/projects/khtmld)
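
Because the daemon captures in the background, the image is not there the moment echo returns. Below is a rough sketch of queueing a capture and then polling for the result with a timeout. The function names wait_for_file and capture_page and the 60-second default are my own choices. The spool path defaults to the one used above.

```shell
#!/bin/sh
# Wait until a file exists and is non-empty, polling once per second.
# Returns 0 when the file appears, 1 if timeout_secs pass first.
wait_for_file() {
    path=$1
    timeout_secs=$2
    elapsed=0
    while [ ! -s "$path" ]; do
        if [ "$elapsed" -ge "$timeout_secs" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Queue a capture through the khtmld spool, then wait for the output file.
# Arguments: url, output file, timeout in seconds (default 60, arbitrary),
# spool path (default /tmp/khtmldspool).
capture_page() {
    url=$1
    out=$2
    timeout_secs=${3:-60}
    spool=${4:-/tmp/khtmldspool}
    rm -f "$out"
    echo "$url $out" > "$spool"
    wait_for_file "$out" "$timeout_secs"
}

# Example:
# capture_page "http://www.yahoo.com" /tmp/yahoo.png 60 || echo "capture timed out"
```

A non-empty file may still be only partly written, so treat success here as "the capture has started to appear" rather than as a guarantee that it is complete.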

We also used the ImageMagick command "convert" (http://www.imagemagick.org/script/convert.php) to trim the image and remove the surrounding whitespace.

convert /tmp/yahoo.png -fuzz 1% -trim /tmp/new.yahoo.png

Sample PHP code for capturing & displaying a PNG image using "khtml2png"

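<?php
$webpage_url = "http://www.yahoo.com";

$out_put_file = "/tmp/yahoo.png"; // captured screen
$new_out_put_file = "/tmp/new.yahoo.png"; // whitespace removed

// queue the capture by writing "URL FILE" to the khtmld spool;
// escapeshellarg() keeps a hostile URL from injecting shell commands
$cmd = "echo " . escapeshellarg($webpage_url . " " . $out_put_file) . " >/tmp/khtmldspool";
exec($cmd);

// wait until khtml2png has written the capture, but give up after 60 seconds
// (the 60-second limit is an arbitrary choice, adjust as needed)
$waited = 0;
while (!file_exists($out_put_file) && $waited < 60) {
    sleep(3);
    $waited += 3;
}

// exec() blocks until convert exits, so no second wait loop is needed
if (file_exists($out_put_file)) {
    exec("convert " . escapeshellarg($out_put_file) . " -fuzz 1% -trim " . escapeshellarg($new_out_put_file));
}

if (!file_exists($new_out_put_file)) {
    header("HTTP/1.1 500 Internal Server Error");
    exit("Screen capture failed");
}

// display image on browser
if (ob_get_level() > 0) { ob_clean(); }
header("Cache-Control: no-cache");
header("Pragma: no-cache");
header("Content-type: image/png");
readfile($new_out_put_file);

unlink($out_put_file);
unlink($new_out_put_file);
exit;
?>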


Hope this will be helpful.

That's all for now.
