Automate Cropping of Photos from Scanned Album Pages

How do you easily share memories captured in photo albums going back generations? Simple Scan is really simple: it scans a page, but extracting the individual photos from the scanned image is a very tedious task. So, this month's exploration is how to automate the extraction and reduce the drudgery. It is clearly an example of the xkcd 'Automation' comic (http://xkcd.com/1319/)! However, it also falls into the category of spending time on “What Doesn't Seem Like Work?” (http://www.paulgraham.com/work.html).

There is a wonderful script, 'multicrop' (http://www.fmwconcepts.com/imagemagick/multicrop/), which uses ImageMagick tools to crop and straighten images.

Extending the concept behind multicrop

The basic logic of the multicrop script is as follows:

  • fuzzy replace the background color by none and the rest by red

  • actual images will be islands of red surrounded by none

  • extract each red island and bound it in a rectangle

  • use the rectangle as a mask on the original image to extract the photo.
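
The island-and-rectangle idea can be illustrated on a toy grid in pure Python, without ImageMagick. This is only a sketch of the concept, not the multicrop implementation: the function name find_islands and the '#' and '.' characters (standing in for red and none) are assumptions made for the illustration.

```python
from collections import deque

def find_islands(grid):
    """Return bounding boxes (left, top, right, bottom) of connected '#' regions."""
    height, width = len(grid), len(grid[0])
    seen = set()
    boxes = []
    for y in range(height):
        for x in range(width):
            if grid[y][x] != "#" or (x, y) in seen:
                continue
            # Breadth-first flood fill over 4-connected neighbours.
            queue = deque([(x, y)])
            seen.add((x, y))
            left, top, right, bottom = x, y, x, y
            while queue:
                cx, cy = queue.popleft()
                left, right = min(left, cx), max(right, cx)
                top, bottom = min(top, cy), max(bottom, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    inside = 0 <= nx < width and 0 <= ny < height
                    if inside and grid[ny][nx] == "#" and (nx, ny) not in seen:
                        seen.add((nx, ny))
                        queue.append((nx, ny))
            boxes.append((left, top, right, bottom))
    return boxes

# '.' plays the role of "none" (background), '#' the role of red (photo).
page = [
    "........",
    ".##..#..",
    ".##.###.",
    "........",
]
print(find_islands(page))  # [(1, 1, 2, 2), (4, 1, 6, 2)]
```

Note that the second island is not rectangular, yet its bounding box still covers the whole region. That is why a partly eaten photo can still be extracted intact.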

A fuzz factor controls how close a colour must be to the background colour to be treated as background. If the value is too high, part of the photo may be lost in the background and the photo may be split into multiple parts. If the value is too low, photos may not be extracted. However, even if a part of the photo is treated as background, as long as the enclosing rectangle is the size of the original photo, you don't have to worry.
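
To get a feel for the trade-off, here is a deliberately simplified model of a fuzz test. It assumes that the fuzz percentage is a Euclidean distance in RGB, measured relative to the distance between black and white; ImageMagick's exact metric may differ, and the function name and the sample colours are made up for the illustration.

```python
import math

def within_fuzz(colour, target, fuzz_percent):
    """Simplified model: RGB distance as a percentage of black-to-white."""
    max_distance = math.sqrt(3 * 255 ** 2)
    distance = math.dist(colour, target)
    return 100.0 * distance / max_distance <= fuzz_percent

background = (200, 190, 170)   # hypothetical album-page colour
page_noise = (205, 195, 175)   # slightly different shade of the page
light_sky = (215, 205, 190)    # hypothetical light area inside a photo

print(within_fuzz(page_noise, background, 6))   # True: treated as background
print(within_fuzz(light_sky, background, 6))    # False: stays part of the photo
print(within_fuzz(light_sky, background, 10))   # True: the sky is now lost
```

In this model, raising the fuzz from 6% to 10% is enough for the light sky to be swallowed by the background, which is the failure described above.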

The script worked very well with multiple loose photos scanned at a time as long as there was some gap between the photos and on the boundaries.

My problem was that the photos could not be removed from the album without damage. Furthermore, the background was not uniform; it consisted of multiple colours.

Hence, in the above logic, I decided to replace the first step with three:

  • select a set of colours from the border

  • replace each colour by none

  • replace what is left of the image by red
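
The article does not say how the border colours were chosen, so the following is just one possible sketch. It assumes the page is available as a nested list of (r,g,b) tuples, groups near-identical shades into coarse buckets, and returns the most common buckets found along the border. The function name, the bucket size and the sample colours are all hypothetical.

```python
from collections import Counter

def border_colours(pixels, count=3, step=16):
    """Return the `count` most common border colours after coarse bucketing."""
    height, width = len(pixels), len(pixels[0])
    border = [
        pixels[y][x]
        for y in range(height)
        for x in range(width)
        if x in (0, width - 1) or y in (0, height - 1)
    ]
    # Map each channel to the centre of its bucket so similar shades merge.
    buckets = Counter(
        tuple((v // step) * step + step // 2 for v in colour) for colour in border
    )
    return [colour for colour, _ in buckets.most_common(count)]

beige, brown, photo = (200, 190, 170), (90, 60, 40), (10, 120, 220)
page = [
    [beige, beige, beige, brown],
    [beige, photo, photo, brown],
    [beige, photo, photo, brown],
    [beige, beige, beige, brown],
]
print(border_colours(page, count=2))  # [(200, 184, 168), (88, 56, 40)]
```

Each returned colour would then be used as one bgcolor in the floodfill step.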

It helped to reduce the drudgery, though it definitely did not save any time!

ImageMagick steps


Fig 1: The Starting Image

Fig 2: After removing the background and replacing the remaining image by red

Fig 3: Mask for extracting a photo

The following steps have been adapted from the 'multicrop' script referenced above, though I converted the steps into a Python script using os.system, subprocess.call and subprocess.check_output methods. For more details about the convert options, see http://www.imagemagick.org/script/command-line-options.php. Sample values are used where needed to simplify the examples.

Convert the image file (Fig 1) into ImageMagick's internal mpc format and use the mpc format for the intermediate steps for efficient processing:

convert image.jpg +repage out.mpc

For each background colour, bgcolor, given as an (r,g,b) tuple, rename out.mpc and out.cache to in.mpc and in.cache, and floodfill with none to replace the background colour. A 1x1 border of the background colour is added to ensure that floodfilling reaches the image from all sides; the border is then shaved off.

rename out in out.*

convert in.mpc -fuzz 6% -fill none \
-bordercolor "srgb(r,g,b)" -border 1x1 \
-draw "matte 0,0 floodfill" -shave 1x1 out.mpc
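
In a Python script, the same command can be built as an argument list, which avoids shell quoting altogether. This is a minimal sketch rather than the original script: the function name floodfill_command and the sample colour are assumptions, and the command is executed only if ImageMagick's convert and the input file are actually present.

```python
import os
import shutil
import subprocess

def floodfill_command(bgcolor, fuzz=6, src="in.mpc", dst="out.mpc"):
    """Build the convert call that floodfills one background colour with none."""
    r, g, b = bgcolor
    return [
        "convert", src,
        "-fuzz", f"{fuzz}%",
        "-fill", "none",
        "-bordercolor", f"srgb({r},{g},{b})",
        "-border", "1x1",
        "-draw", "matte 0,0 floodfill",   # one argument, so no quotes needed
        "-shave", "1x1",
        dst,
    ]

cmd = floodfill_command((200, 190, 170))  # hypothetical background colour
print(" ".join(cmd))

# Run the command only when the tool and the input file exist.
if shutil.which("convert") and os.path.exists("in.mpc"):
    subprocess.call(cmd)
```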

The next step is to replace the remaining part of the image with red. You will get an image similar to Fig 2.

convert out.mpc -fuzz 6% -fill red +opaque none \
-background black -alpha background TMP2.mpc

You now need to find a cluster of red pixels. Since your photo will not be very small, rather than searching pixel by pixel, you can speed up the process by a factor of about 100 by checking only every 10th pixel in each row and column.
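
A short sketch of the sampling grid shows where the factor of 100 comes from. The function name and the toy image size are made up for the example.

```python
def sample_points(width, height, step=10):
    """Yield (x, y) for every `step`-th pixel in each row and column."""
    for y in range(0, height, step):
        for x in range(0, width, step):
            yield (x, y)

width, height = 200, 100
points = list(sample_points(width, height))
print(len(points))                     # 200 sample points
print(width * height // len(points))   # 100 times fewer than all 20000 pixels
```

The step should stay well below the size of the smallest photo you expect, or a photo could fall between the sample points.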

To get the colour at pixel (x,y):

color = `convert TMP2.mpc -channel rgba -alpha on -format "%[pixel:u.p{x,y}]" info:`
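
From Python, the backticks become a subprocess.check_output call. The sketch below uses hypothetical helper names and only builds and prints the command; pixel_colour itself needs ImageMagick and TMP2.mpc to be present. The exact text returned for red or transparent pixels is not shown in the article, so compare against the values you observe on your own installation.

```python
import subprocess

def pixel_colour_command(x, y, image="TMP2.mpc"):
    """Build the convert call that reports the colour at pixel (x, y)."""
    # Doubled braces in the f-string produce the literal { and } of u.p{x,y}.
    return [
        "convert", image, "-channel", "rgba", "-alpha", "on",
        "-format", f"%[pixel:u.p{{{x},{y}}}]", "info:",
    ]

def pixel_colour(x, y, image="TMP2.mpc"):
    """Run the command and return the colour string reported by convert."""
    output = subprocess.check_output(pixel_colour_command(x, y, image))
    return output.decode().strip()

print(pixel_colour_command(30, 40)[-2])  # %[pixel:u.p{30,40}]
```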

If the color is not none but red, replace the contiguous red pixels by white:

convert TMP2.mpc -channel rgba -alpha on -fill white \
-draw "color x,y floodfill" TMP3.mpc

Now, you want to keep only the white part. So, fill all pixels that are not white with transparency and then turn transparency off so that everything that is not white becomes black.

convert TMP3.mpc -channel rgba -alpha on -fill none +opaque white -alpha off TMP3A.mpc

The white part is not a rectangle. So, clone the image and trim it so that it bounds the white part. Now, replace all black by white in this trimmed image.

convert TMP3A.mpc -trim -fill white -opaque black TMP3B.mpc

Now, flatten the second image on top of the previous one to get the mask for a photo (see Fig 3).

convert TMP3A.mpc TMP3B.mpc -flatten TMP4.mpc

The above steps can be combined into a single convert command as follows:

convert \( TMP3.mpc -channel rgba -alpha on -fill none +opaque white -alpha off \) \
\( +clone -trim -fill white -opaque black \) \
-flatten TMP4.mpc
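
When this combined command is passed to subprocess as a list, the parentheses are ordinary arguments and need no backslashes, because no shell is involved. A minimal sketch, with a hypothetical function name:

```python
def mask_command(src="TMP3.mpc", dst="TMP4.mpc"):
    """Build the single convert call that turns the white island into a mask."""
    return [
        "convert",
        # First group: keep white, make everything else black.
        "(", src, "-channel", "rgba", "-alpha", "on",
        "-fill", "none", "+opaque", "white", "-alpha", "off", ")",
        # Second group: trimmed clone, filled completely with white.
        "(", "+clone", "-trim", "-fill", "white", "-opaque", "black", ")",
        "-flatten", dst,
    ]

cmd = mask_command()
print(" ".join(cmd))
# To execute: subprocess.call(cmd), with ImageMagick and TMP3.mpc available.
```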

The photo can now be extracted:

convert image.jpg TMP4.mpc -compose multiply -composite -trim photo-1.jpg

While extracting, you may want to add the logic to straighten the image as well. So, instead, use the following command:

convert image.jpg TMP4.mpc -compose multiply -fuzz 6% -composite -trim \
-deskew 40% -trim +repage photo-1.jpg

The multicrop script adds a border as well for a better presentation.

Now, you need to remove the white image area so that it is not used again.

convert TMP3.mpc -channel rgba -alpha on -fill none -opaque white TMP2.mpc

You are now ready to find another red pixel in TMP2.mpc and extract the next photo.

Usually, you will want to discard small photos as there may be spurious small islands of red. At times, you may find that the extracted image is smaller than the actual photo, e.g., if the sky is light, it may be mistaken for the background. So, there is considerable scope for making the script a lot smarter!
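
One simple way to discard spurious islands is to compare the size of each extracted image with the size of the scanned page. The sketch below is an assumption, not part of multicrop: the helper names and the 10% threshold are made up, and image_size relies on ImageMagick's identify command being installed.

```python
import subprocess

def image_size(path):
    """Return (width, height) of an image as reported by identify."""
    output = subprocess.check_output(["identify", "-format", "%w %h", path])
    width, height = output.decode().split()
    return int(width), int(height)

def keep_photo(width, height, page_width, page_height, min_fraction=0.1):
    """Keep a crop only if both sides are at least min_fraction of the page."""
    return (width >= min_fraction * page_width
            and height >= min_fraction * page_height)

# Hypothetical sizes in pixels for a 2480x3500 scan.
print(keep_photo(900, 600, 2480, 3500))   # True: a plausible photo
print(keep_photo(40, 25, 2480, 3500))     # False: a stray speck of red
```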

Tailpiece: Improve a scanned text page

Scanning a text page is never easy with Simple Scan, especially with old documents. Using the text mode, some folds show up as lines. If the text is faded or shaded, parts of the characters are missing. While the visual result of scanning in photo mode is much better, a printout normally has a distracting gray background and readability is often lost in the process.

A solution is to use the white-threshold option in ImageMagick's convert command line utility after scanning a text document as a photo, e.g.

$ convert scanned_text.jpg -colorspace gray -white-threshold 60% printable.jpg
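
The effect of the option can be modelled on a handful of gray values: everything brighter than the threshold is forced to white, and everything else is left alone. This is an illustrative sketch on an 8-bit scale with made-up sample values, not ImageMagick's code.

```python
def white_threshold(gray_values, percent=60, max_value=255):
    """Force values above the threshold to white; leave the rest unchanged."""
    cutoff = max_value * percent / 100.0   # 60% of 255 is 153
    return [max_value if value > cutoff else value for value in gray_values]

# Dark ink, faded ink, a value exactly at the cutoff, and two page grays.
print(white_threshold([30, 120, 153, 154, 200]))  # [30, 120, 153, 255, 255]
```

The gray page background ends up pure white while the dark and faded strokes keep their original shades, which is why the characters do not break up as they do in text mode.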