Spin Text For SEO - A PHP Spinner

Update: A new version of this code is available on the Updated PHP Spinner page.

I recently had an SEO expert give us a few hours of his time to provide some suggestions to compliment our SEO strategy. One of the techniques that he introduced I was so impressed with (due to it's utter simplicity) that I am kicking myself for not thinking of it before! Basically, if you have 'doorway' pages into your site (e.g. you have pages for 'Cambridge Widgets' and 'Preston Widgets' and alike) and want them to be dynamically generated, from the same content, but not to suffer horrible duplicate content penalties, you can use a 'spinning' function to generate contextually similar, different, content. The original concept is here, but I have enhanced it a bit to better suit my needs.

The original function had a few issues which made it unsuitable for my use, but a train journey home after our meeting provided me with ample time to enhance it for my needs. Basically I needed it to be:

None of these were difficult to add and I eventually ended up with two functions (the spin function and one to replace only the first instance of a string) for the job:

function spin($string, $seedPageName = true, $calculate = false, $openingConstruct = '{{', $closingConstruct = '}}', $separator = '|')
{
	# Choose whether to return the string or the number of permutations
	$return = 'string';
	if($calculate)
	{
		$permutations	= 1;
		$return			= 'permutations';
	}
 
	# If we have nothing to spin just exit (don't use a regexp)
	if(strpos($string, $openingConstruct) === false)
	{
		return $$return;
	}
 
	if(preg_match_all('!'.$openingConstruct.'(.*?)'.$closingConstruct.'!s', $string, $matches))
	{
		# Optional, always show a particular combination on the page
		if($seedPageName)
		{
			mt_srand(crc32($_SERVER['REQUEST_URI']));
		}
 
		$find 		= array();
		$replace 	= array();
 
		foreach($matches[0] as $key => $match)
		{
			$choices = explode($separator, $matches[1][$key]);
 
			if($calculate)
			{
				$permutations *= count($choices);
			}
			else
			{
				$find[] 	= $match;
				$replace[] 	= $choices[mt_rand(0, count($choices) - 1)];
			}
		}
 
		if(!$calculate)
		{
			# Ensure multiple instances of the same spinning combinations will spin differently
			$string = str_replace_first($find, $replace, $string);
		}
	}
 
	return $$return;
}
 
# Similar to str_replace, but only replaces the first instance of the needle
function str_replace_first($find, $replace, $string)
{
	# Ensure we are dealing with arrays
	if(!is_array($find))
	{
		$find = array($find);
	}
 
	if(!is_array($replace))
	{
		$replace = array($replace);
	}
 
	foreach($find as $key => $value)
	{
		if(($pos = mb_strpos($string, $value)) !== false)
		{
			# If we have no replacement make it empty
			if(!isset($replace[$key]))
			{
				$replace[$key] = '';
			}
 
			$string = mb_substr($string, 0, $pos).$replace[$key].mb_substr($string, $pos + mb_strlen($value));
		}
	}
 
	return $string;
}

And an example:

$string = '{{The|A}} {{quick|speedy|fast}} {{brown|black|red}} {{fox|wolf}} {{jumped|bounded|hopped|skipped}} over the {{lazy|tired}} {{dog|hound}}';
 
echo '<p><b>'.spin($string, false, true).'</b> permutations...</p><p>';
 
for($i = 1; $i <= 5; $i++)
{
	echo spin($string, false).'<br />';
}
 
echo '</p>';

Which produces:

576 permutations...

The speedy black wolf bounded over the lazy hound
A speedy brown fox skipped over the tired hound
The quick red wolf bounded over the lazy hound
The fast brown fox hopped over the tired dog
The speedy brown fox jumped over the tired dog

I'm sure that it isn't perfect, but perhaps it will provide inspiration to someone else, like it did for me!

Update

Due to a request for nested brackets I have produced another (very different) version of this system that allows spin block nesting. It's a bit rough and ready (because I have limited time at the minute) and requires an extra function to run (strpos_all). The cost of this enhancement is that it no longer calculates permutations (because that is pretty difficult and I'm lazy!), but I'm sure it's possible if you really want to add it... Just remember when using this that nested spin blocks provide far less permutations than separate ones!

function spin($string, $seedPageName = true, $openingConstruct = '{{', $closingConstruct = '}}', $separator = '|')
{
	# If we have nothing to spin just exit
	if(strpos($string, $openingConstruct) === false)
	{
		return $string;
	}
 
	# Find all positions of the starting and opening braces
	$startPositions	= strpos_all($string, $openingConstruct);
	$endPositions	= strpos_all($string, $closingConstruct);
 
	# There must be the same number of opening constructs to closing ones
	if($startPositions === false OR count($startPositions) !== count($endPositions))
	{
		return $string;
	}
 
	# Optional, always show a particular combination on the page
	if($seedPageName)
	{
		mt_srand(crc32($_SERVER['REQUEST_URI']));
	}
 
	# Might as well calculate these once
	$openingConstructLength = mb_strlen($openingConstruct);
	$closingConstructLength = mb_strlen($closingConstruct);
 
	# Organise the starting and opening values into a simple array showing orders
	foreach($startPositions as $pos)
	{
		$order[$pos] = 'open';
	}
	foreach($endPositions as $pos)
	{
		$order[$pos] = 'close';
	}
	ksort($order);
 
	# Go through the positions to get the depths
	$depth = 0;
	$chunk = 0;
	foreach($order as $position => $state)
	{
		if($state == 'open')
		{
			$depth++;
			$history[] = $position;
		}
		else
		{
			$lastPosition	= end($history);
			$lastKey		= key($history);
			unset($history[$lastKey]);
 
			$store[$depth][] = mb_substr($string, $lastPosition + $openingConstructLength, $position - $lastPosition - $closingConstructLength);
			$depth--;
		}
	}
	krsort($store);
 
	# Remove the old array and make sure we know what the original state of the top level spin blocks was
	unset($order);
	$original = $store[1];
 
	# Move through all elements and spin them
	foreach($store as $depth => $values)
	{
		foreach($values as $key => $spin)
		{
			# Get the choices
			$choices = explode($separator, $store[$depth][$key]);
			$replace = $choices[mt_rand(0, count($choices) - 1)];
 
			# Move down to the lower levels
			$level = $depth;
			while($level > 0)
			{
				foreach($store[$level] as $k => $v)
				{
					$find = $openingConstruct.$store[$depth][$key].$closingConstruct;
					if($level == 1 AND $depth == 1)
					{
						$find = $store[$depth][$key];
					}
					$store[$level][$k] = str_replace_first($find, $replace, $store[$level][$k]);
				}
				$level--;
			}
		}
	}
 
	# Put the very lowest level back into the original string
	foreach($original as $key => $value)
	{
		$string = str_replace_first($openingConstruct.$value.$closingConstruct, $store[1][$key], $string);
	}
 
	return $string;
}
 
# Similar to str_replace, but only replaces the first instance of the needle
function str_replace_first($find, $replace, $string)
{
	# Ensure we are dealing with arrays
	if(!is_array($find))
	{
		$find = array($find);
	}
 
	if(!is_array($replace))
	{
		$replace = array($replace);
	}
 
	foreach($find as $key => $value)
	{
		if(!empty($value))
		{
			if(($pos = mb_strpos($string, $value)) !== false)
			{
				# If we have no replacement make it empty
				if(!isset($replace[$key]))
				{
					$replace[$key] = '';
				}
 
				$string = mb_substr($string, 0, $pos).$replace[$key].mb_substr($string, $pos + mb_strlen($value));
			}
		}
	}
 
	return $string;
}
 
# Finds all instances of a needle in the haystack and returns the array
function strpos_all($haystack, $needle)
{
	$offset = 0;
	$i		= 0;
	$return = false;
 
	while(is_integer($i))
	{   
		$i = mb_strpos($haystack, $needle, $offset);
 
		if(is_integer($i))
		{
			$return[]	= $i;
			$offset		= $i + mb_strlen($needle);
		}
	}
 
	return $return;
}

And an example:

$string = '{{A {{simple|basic}} example|An uncomplicated scenario|The {{simplest|trivial|fundamental|rudimentary}} case|My {{test|invest{{igative|igation}}}} case}} to illustrate the {{function|problem}}';
 
echo '<p>';
 
for($i = 1; $i <= 5; $i++)
{
	echo spin($string, false).'<br />';
}
 
echo '</p>';

Which produces:

A basic example to illustrate the function
My test case to illustrate the problem
The rudimentary case to illustrate the function
An uncomplicated scenario to illustrate the problem
The fundamental case to illustrate the problem

As before I'm sure that this one isn't perfect, but perhaps it will provide inspiration to someone else. All corrections / comments are welcome.