How to tell Ads from Photos and other Images

When I was a child, my great-aunt Cornelia gave me a copy of Robert Williams Wood's How to tell the Birds from the Flowers: And Other Woodcuts. The joke behind every poem and illustration is the same: you can't. Check out Amazon's cover shot to see that a Pansy looks very much like a Chim-pansy, for example. I faced a very similar question while writing a package of JavaScript scripts for the Boston Review. The scripts parse the InDesign files for an entire issue into the files needed to upload all of that issue's content to the new Drupal-based web site I am building for them. The question: how do you tell an ad from a photo?

Although I still can't tell a bird from a flower, I'm glad to report that I have found a trick that allows me (and my code) to tell an ad from a photo every time -- at least in the Boston Review's current format.

Here's the basis of the trick: ads are not attributed and photos and other images are attributed, every time. So if one's code can programmatically detect the attribution, it can tell the difference between an ad and a photo or other image, every time. If it has an attribution, it is a photo or other image, and if it does not have an attribution it is an ad.

In the current format of the Boston Review an attribution is given as a sideways piece of text next to the photo that reads from bottom to top. In InDesign, it's implemented as a very tall and thin TextFrame, the height holding the length of the attribution and the width holding the text's height. So my idea is to go through all the PageItems on the same layout page as the image in question and look for a tall thin TextFrame next to the image.
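
The "tall and thin TextFrame" test described above might look something like the following. This is only a minimal sketch: the function name looksLikeAttribution() and the 5:1 height-to-width threshold are assumptions made for illustration, not values taken from the Boston Review scripts. It relies on InDesign's geometricBounds being ordered [top, left, bottom, right] and on item.constructor.name returning the object type.

```javascript
// Hypothetical helper (name and threshold are assumptions, not the original code).
// Returns true when the item is a TextFrame that is much taller than it is wide.
function looksLikeAttribution(item, minAspectRatio) {
    if (minAspectRatio === undefined) {
        minAspectRatio = 5; // assumed threshold: at least 5 times taller than wide
    }
    // In ExtendScript, constructor.name reports the DOM type, e.g. "TextFrame".
    if (!item || item.constructor.name !== "TextFrame") {
        return false;
    }
    var bounds = item.geometricBounds; // [top, left, bottom, right]
    var height = bounds[2] - bounds[0];
    var width = bounds[3] - bounds[1];
    return width > 0 && height / width >= minAspectRatio;
}
```

The threshold would need tuning against real pages, since a narrow column of ordinary body text could also be tall and thin.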

So here is the code for my checkIfAd() routine, which identifies Boston Review ads, every time. That way my code can skip the ads on import. A very useful trick.

// checkIfAd()
// Given all the page items on a page and the item index of a graphic, checks if there
// are any tall and skinny TextFrames on its right side.  If so, it is judged not
// to be an ad (since it has an attribution) and if no attribution TextFrame is found
// it is judged to be an ad.
function checkIfAd(pageItems, itemIndex) {
	// geometricBounds is [top, left, bottom, right]
	var t = pageItems[itemIndex].geometricBounds[0];
	var l = pageItems[itemIndex].geometricBounds[1];
	var b = pageItems[itemIndex].geometricBounds[2];
	var r = pageItems[itemIndex].geometricBounds[3];
	var isAttribution = false;
	var i, pi_t, pi_b, pi_r;
	for (i = 0; i < pageItems.length; i++) {
		// Skip the graphic itself; only tall, thin TextFrames are candidates.
		if (i != itemIndex && checkIfAttribution(pageItems[i])) {
			pi_t = pageItems[i].geometricBounds[0];
			pi_b = pageItems[i].geometricBounds[2];
			pi_r = pageItems[i].geometricBounds[3];
			// The frame must start below the image's top, end no lower than 1.1 times
			// the image's bottom coordinate, and its right edge may extend past the
			// image's right edge by less than 10% of the image's width.
			if (pi_t > t && pi_b <= 1.1 * b && (pi_r - r) / (r - l) < 0.1) {
				isAttribution = true;
			}
		}
	}
	return !isAttribution;
}
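
To make the placement test inside checkIfAd() easier to experiment with, here is a sketch that pulls the same three comparisons out into a standalone function that works on plain bounds arrays. The name isPlacedLikeAttribution() is hypothetical, and the bounds in the examples are made-up numbers, not measurements from a real Boston Review page.

```javascript
// Hypothetical standalone version of the placement comparison in checkIfAd().
// Both arguments are [top, left, bottom, right] arrays, as in geometricBounds.
function isPlacedLikeAttribution(imageBounds, frameBounds) {
    var top = imageBounds[0];
    var left = imageBounds[1];
    var bottom = imageBounds[2];
    var right = imageBounds[3];

    // The frame must start below the top edge of the image.
    var startsBelowImageTop = frameBounds[0] > top;
    // The frame must end no lower than 1.1 times the image's bottom coordinate.
    var endsNearImageBottom = frameBounds[2] <= 1.1 * bottom;
    // How far the frame's right edge extends past the image's right edge,
    // as a fraction of the image's width.
    var rightEdgeOvershoot = (frameBounds[3] - right) / (right - left);

    return startsBelowImageTop && endsNearImageBottom && rightEdgeOvershoot < 0.1;
}

var image = [10, 20, 50, 80]; // made-up image bounds, 60 units wide

isPlacedLikeAttribution(image, [12, 81, 50, 83]);   // true: hugs the right edge
isPlacedLikeAttribution(image, [12, 100, 50, 102]); // false: too far to the right
isPlacedLikeAttribution(image, [5, 81, 50, 83]);    // false: starts above the image
isPlacedLikeAttribution(image, [12, 5, 50, 7]);     // true: no lower limit on the overshoot
```

The last example shows that the comparison, as written, has no lower limit on the overshoot, so a tall thin frame well to the left of the image also passes. That seems to be harmless in the layout described here, but a stricter check might be needed for other formats.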

