Spin Text For SEO – A PHP Spinner


I recently had an SEO expert give us a few hours of his time to suggest ways to complement our SEO strategy. I was so impressed with one of the techniques he introduced (due to its utter simplicity) that I am kicking myself for not thinking of it before! Basically, if you have ‘doorway’ pages into your site (e.g. pages for ‘Cambridge Widgets’, ‘Preston Widgets’ and the like) and want them to be dynamically generated from the same content without suffering horrible duplicate content penalties, you can use a ‘spinning’ function to generate content that differs from page to page but stays contextually similar. The original concept is here, but I have enhanced it a bit to better suit my needs.

The original function had a few issues which made it unsuitable for my use, but a train journey home after our meeting provided me with ample time to enhance it for my needs. Basically I needed it to be:

  • either pseudo random or predictable (always the same on a per-page basis), without needing a block of code outside the function
  • able to include literal curly braces without spinning them, by making the function require two braces rather than one (i.e. {{spin|me}})
  • able to use the same ‘spin block’ (i.e. {{phrase 1|phrase 2|phrase 3}}) multiple times, but treat each one differently
  • able to calculate the number of permutations that I was passing in (to ensure I was waaaay over the number of pages I was driving from one set of text)

None of these were difficult to add and I eventually ended up with two functions (the spin function and one to replace only the first instance of a string) for the job:

function spin($string, $seedPageName = true, $calculate = false, $openingConstruct = '{{', $closingConstruct = '}}')
{
    # Choose whether to return the string or the number of permutations
    $return = 'string';
    if($calculate)
    {
        $permutations   = 1;
        $return         = 'permutations';
    }

    # If we have nothing to spin just exit (don't use a regexp)
    if(strpos($string, $openingConstruct) === false)
    {
        return $$return;
    }
   
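    # Quote the constructs so that any regex metacharacters in custom ones are matched literally
    if(preg_match_all('!'.preg_quote($openingConstruct, '!').'(.*?)'.preg_quote($closingConstruct, '!').'!s', $string, $matches))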
    {
        # Optional, always show a particular combination on the page
        if($seedPageName)
        {
            mt_srand(crc32($_SERVER['REQUEST_URI']));
        }

        $find       = array();
        $replace    = array();

        foreach($matches[0] as $key => $match)
        {
            $choices = explode('|', $matches[1][$key]);

            if($calculate)
            {
                $permutations *= count($choices);
            }
            else
            {
                $find[]     = $match;
                $replace[]  = $choices[mt_rand(0, count($choices) - 1)];
            }
        }

        if(!$calculate)
        {
            # Ensure multiple instances of the same spinning combinations will spin differently
            $string = str_replace_first($find, $replace, $string);
        }
    }

    return $$return;
}

# Similar to str_replace, but only replaces the first instance of the needle
function str_replace_first($find, $replace, $string)
{
    # Ensure we are dealing with arrays
    if(!is_array($find))
    {
        $find = array($find);
    }

    if(!is_array($replace))
    {
        $replace = array($replace);
    }

    foreach($find as $key => $value)
    {
        if(($pos = strpos($string, $value)) !== false)
        {
            # If we have no replacement make it empty
            if(!isset($replace[$key]))
            {
                $replace[$key] = '';
            }

            # strpos() returns a byte offset, so splice with the byte-based substr_replace() rather than mb_substr()
            $string = substr_replace($string, $replace[$key], $pos, strlen($value));
        }
    }

    return $string;
}

And an example:

$string = '{{The|A}} {{quick|speedy|fast}} {{brown|black|red}} {{fox|wolf}} {{jumped|bounded|hopped|skipped}} over the {{lazy|tired}} {{dog|hound}}';

echo '<p><b>'.spin($string, false, true).'</b> permutations...</p><p>';

for($i = 1; $i <= 5; $i++)
{
    echo spin($string, false).'<br />';
}

echo '</p>';

Which produces:

576 permutations…

The speedy black wolf bounded over the lazy hound
A speedy brown fox skipped over the tired hound
The quick red wolf bounded over the lazy hound
The fast brown fox hopped over the tired dog
The speedy brown fox jumped over the tired dog
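
The ‘predictable’ option rests on one idea: seed the random number generator from something that identifies the page, and the same choices fall out every time. Here is a minimal sketch of just that part, assuming a hypothetical helper called pick_for_page and a made-up page path (neither is part of the functions above, which seed from $_SERVER['REQUEST_URI']):

```php
<?php
// Hypothetical helper (illustration only): pick one option for a page, repeatably.
function pick_for_page(array $choices, string $pagePath): string
{
    // crc32() turns the path into an integer, so the same path always gives the same seed
    mt_srand(crc32($pagePath));

    // With a fixed seed, mt_rand() returns the same value on every run
    return $choices[mt_rand(0, count($choices) - 1)];
}

$choices = array('quick', 'speedy', 'fast');

// '/widgets/cambridge' is an assumed example path
$first  = pick_for_page($choices, '/widgets/cambridge');
$second = pick_for_page($choices, '/widgets/cambridge');

var_dump($first === $second); // bool(true)
```

Passing false as the second argument to spin(), as the example above does, skips the seeding so that each call can come out differently.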

I’m sure that it isn’t perfect, but perhaps it will provide inspiration to someone else, like it did for me!

Update

Due to a request for nested brackets I have produced another (very different) version of this system that allows spin block nesting. It’s a bit rough and ready (because I have limited time at the minute) and requires an extra function to run (strpos_all). The cost of this enhancement is that it no longer calculates permutations (because that is pretty difficult and I’m lazy!), but I’m sure it’s possible if you really want to add it… Just remember when using this that nested spin blocks provide far fewer permutations than separate ones, because the options inside a nested block only come into play when the option containing it is chosen!

function spin($string, $seedPageName = true, $openingConstruct = '{{', $closingConstruct = '}}')
{
    # If we have nothing to spin just exit
    if(strpos($string, $openingConstruct) === false)
    {
        return $string;
    }

    # Find all positions of the opening and closing constructs
    $startPositions = strpos_all($string, $openingConstruct);
    $endPositions   = strpos_all($string, $closingConstruct);

    # There must be the same number of opening constructs to closing ones
    if($startPositions === false OR $endPositions === false OR count($startPositions) !== count($endPositions))
    {
        return $string;
    }

    # Optional, always show a particular combination on the page
    if($seedPageName)
    {
        mt_srand(crc32($_SERVER['REQUEST_URI']));
    }

    # Might as well calculate these once (byte lengths, to match the byte offsets from strpos)
    $openingConstructLength = strlen($openingConstruct);
    $closingConstructLength = strlen($closingConstruct);

    # Organise the opening and closing positions into a simple array showing their order
    $order = array();
    foreach($startPositions as $pos)
    {
        $order[$pos] = 'open';
    }
    foreach($endPositions as $pos)
    {
        $order[$pos] = 'close';
    }
    ksort($order);

    # Go through the positions to get the depths
    $depth      = 0;
    $history    = array();
    $store      = array();
    foreach($order as $position => $state)
    {
        if($state == 'open')
        {
            $depth++;
            $history[] = $position;
        }
        else
        {
            if(!$history)
            {
                # A closing construct appeared before its opening one, so leave the string alone
                return $string;
            }

            # Pop the most recent unmatched opening position
            $lastPosition = array_pop($history);

            # The inner text runs from just after the opening construct to just before the closing one
            $store[$depth][] = substr($string, $lastPosition + $openingConstructLength, $position - $lastPosition - $openingConstructLength);
            $depth--;
        }
    }
    krsort($store);

    # Remove the old array and make sure we know what the original state of the top level spin blocks was
    unset($order);
    $original = $store[1];

    # Move through all elements and spin them
    foreach($store as $depth => $values)
    {
        foreach($values as $key => $spin)
        {
            # Get the choices
            $choices = explode('|', $store[$depth][$key]);
            $replace = $choices[mt_rand(0, count($choices) - 1)];

            # Move down to the lower levels
            $level = $depth;
            while($level > 0)
            {
                foreach($store[$level] as $k => $v)
                {
                    $find = $openingConstruct.$store[$depth][$key].$closingConstruct;
                    if($level == 1 AND $depth == 1)
                    {
                        $find = $store[$depth][$key];
                    }
                    $store[$level][$k] = str_replace_first($find, $replace, $store[$level][$k]);
                }
                $level--;
            }
        }
    }

    # Put the very lowest level back into the original string
    foreach($original as $key => $value)
    {
        $string = str_replace_first($openingConstruct.$value.$closingConstruct, $store[1][$key], $string);
    }

    return $string;
}

# Similar to str_replace, but only replaces the first instance of the needle
function str_replace_first($find, $replace, $string)
{
    # Ensure we are dealing with arrays
    if(!is_array($find))
    {
        $find = array($find);
    }

    if(!is_array($replace))
    {
        $replace = array($replace);
    }

    foreach($find as $key => $value)
    {
        if(($pos = strpos($string, $value)) !== false)
        {
            # If we have no replacement make it empty
            if(!isset($replace[$key]))
            {
                $replace[$key] = '';
            }

            # strpos() returns a byte offset, so splice with the byte-based substr_replace() rather than mb_substr()
            $string = substr_replace($string, $replace[$key], $pos, strlen($value));
        }
    }

    return $string;
}

# Finds all instances of a needle in the haystack and returns an array of byte offsets (or false if there are none)
function strpos_all($haystack, $needle)
{
    $offset     = 0;
    $positions  = array();

    # Keep searching from just past the previous match until strpos() finds nothing more
    while(($i = strpos($haystack, $needle, $offset)) !== false)
    {
        $positions[]    = $i;
        $offset         = $i + strlen($needle);
    }

    return $positions ? $positions : false;
}

And an example:

$string = '{{A {{simple|basic}} example|An uncomplicated scenario|The {{simplest|trivial|fundamental|rudimentary}} case|My {{test|invest{{igative|igation}}}} case}} to illustrate the {{function|problem}}';

echo '<p>';

for($i = 1; $i <= 5; $i++)
{
    echo spin($string, false).'<br />';
}

echo '</p>';

Which produces:

A basic example to illustrate the function
My test case to illustrate the problem
The rudimentary case to illustrate the function
An uncomplicated scenario to illustrate the problem
The fundamental case to illustrate the problem
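
The permutation count that the nested version drops can be recovered with a small recursive walk over the string. The sketch below is a hypothetical helper (count_spins is an invented name, not part of the functions above) and assumes the default {{ and }} constructs with balanced braces. The rule it applies is that options inside one block add up, while blocks sitting side by side multiply:

```php
<?php
// Hypothetical helper (illustration only): counts the combinations that a string
// with nested {{a|b}} spin blocks can produce. Assumes balanced {{ and }}.
function count_spins(string $text, int &$pos = 0, bool $insideBlock = false): int
{
    $total   = 0; // sum over the options of the current block that are already finished
    $current = 1; // product of the nested blocks seen in the current option
    $length  = strlen($text);

    while ($pos < $length) {
        $pair = substr($text, $pos, 2);

        if ($pair === '{{') {
            // A block inside this option multiplies the option's count
            $pos += 2;
            $current *= count_spins($text, $pos, true);
        } elseif ($insideBlock && $pair === '}}') {
            // End of this block: add the last option and hand the sum back
            $pos += 2;
            return $total + $current;
        } elseif ($insideBlock && $text[$pos] === '|') {
            // End of one option: options of the same block add up
            $pos++;
            $total += $current;
            $current = 1;
        } else {
            $pos++;
        }
    }

    // Top level: nothing to sum, just the product of the top-level blocks
    return $total + $current;
}

$nested = '{{A {{simple|basic}} example|An uncomplicated scenario|The {{simplest|trivial|fundamental|rudimentary}} case|My {{test|invest{{igative|igation}}}} case}} to illustrate the {{function|problem}}';
$flat   = '{{The|A}} {{quick|speedy|fast}} {{brown|black|red}} {{fox|wolf}} {{jumped|bounded|hopped|skipped}} over the {{lazy|tired}} {{dog|hound}}';

echo count_spins($nested), PHP_EOL; // 20
echo count_spins($flat), PHP_EOL;   // 576
```

The nested example works out as (2 + 1 + 4 + 3) × 2 = 20, against 576 for the seven separate blocks of the first example, which illustrates the warning about nesting above.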

As before I’m sure that this one isn’t perfect, but perhaps it will provide inspiration to someone else. All corrections / comments are welcome.

  1. #1 by croaker on October 21st, 2009

    Nice improvement, I like it better than the original, though I tweaked mine back to just using single brackets (‘{’, ‘}’). Next improvement is to allow for nested brackets.

  2. #2 by Zack Katz on December 30th, 2009

    Hi, great code! I was wondering if you could update it to work with nested brackets…that would be IDEAL. Thanks!

  3. #3 by Paul Norman on December 30th, 2009

    Zack Katz :

    Hi, great code! I was wondering if you could update it to work with nested brackets…that would be IDEAL. Thanks!

    Hi Zack, thanks. I have put together an updated version of this for you. It’s a bit of a rush job, but should give you a base to work from. Enjoy!

  4. #4 by Zack Katz on January 11th, 2010

    Hi Paul,
    Thanks a lot, I look forward to trying out this code. I’ve been using the original code with ob_start(), which has worked very well.

    Thanks for your help!

  5. #5 by Peter on January 15th, 2010

    Great addition of the nested brackets. I have been trying to figure that one out (looking into recursion and such).

    How do you handle apostrophes or quotes? ha…

  6. #6 by Peter on January 15th, 2010

    So yeah, I figured out the apostrophe or quote thing… use heredocs instead of quotes.

    http://www.php.net/manual/en/language.types.string.php#language.types.string.syntax.heredoc

    Works great!!

  7. #7 by Paul Norman on January 15th, 2010

    Peter :

    So yeah, I figured out the apostrophe or quote thing… use heredocs instead of quotes.

    Or you can just escape the quotes:

    $string = 'My string with "double" and \'single\' quotes...';

  8. #8 by Marcus on February 2nd, 2010

    Well done, great effort!

  9. #9 by Vince on March 8th, 2010

    There is a PHP recursive pattern incantation that might be useful: http://php.net/manual/en/regexp.reference.recursive.php

  10. #10 by Trust on March 18th, 2010

    The nested version is brilliant!
    thank you very much

  11. #11 by Paul Norman on March 18th, 2010

    Vince :

    There is a PHP recursive pattern incantation that might be useful: http://php.net/manual/en/regexp.reference.recursive.php

    You’re right, I should, but personally I’ve never had any luck with the recursive functions – I always just end up running them too many times for it to be efficient. If you’d like to provide a better version please do and you can have the credit for it here!
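
    For anyone exploring the regular expression route, here is a rough sketch of one possibility. It is not the recursive pattern from the linked manual page, and it is not a replacement for the function above: it simply spins the innermost blocks first and repeats until none are left. The name spin_nested is hypothetical, and the default {{ and }} constructs are assumed.

```php
<?php
// Hypothetical alternative (illustration only): resolves nested {{a|b}} blocks
// by repeatedly spinning the innermost blocks until no block remains.
function spin_nested(string $text): string
{
    // Matches a block whose body contains neither '{{' nor '}}', i.e. an innermost block
    $innermost = '/\{\{((?:(?!\{\{|\}\}).)*)\}\}/s';

    do {
        $text = preg_replace_callback(
            $innermost,
            function (array $match): string {
                // Pick one of the pipe-separated options of this block
                $choices = explode('|', $match[1]);
                return $choices[mt_rand(0, count($choices) - 1)];
            },
            $text,
            -1,
            $count // number of blocks replaced in this pass
        );
    } while ($count > 0);

    return $text;
}

$result = spin_nested('{{A {{simple|basic}} example|An uncomplicated scenario}} to illustrate');
echo $result, PHP_EOL;
```

    One trade-off of working from the inside out is that blocks nested inside an option that ends up not being chosen are still spun, which is wasted work but does not affect the result.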

  12. #12 by Jackson Hill on June 24th, 2010

    Some seo experts charge top dollars for website optimization

  13. #13 by alex on August 16th, 2010

    Awesome code.

    Is there any way to modify it so it only spins the content once a day or once a week?

  14. #14 by Katie (South France Immobilier) Radisson on August 16th, 2010

    Hi Paul

    This looks really useful but I need some help as it seems to get stuck occasionally using this double nested format:

    {{Quick Hints for Learning French|Learning French to Quick and Effective Way|How to Learn French in Simple Steps That Will Take You From Beginner to Master|The Best Way To Learn French is Now Available and Easier Than Ever|Stop Procrastinating and Start Learning French With These Quick Hints}|{Handy Hints to Help You Learn French|Ways to Learn French Easily|Tips to Learn French Faster|Learning French Doesn’t Have to Be Hard|Simple Tricks to Learn French Quickly}|{Do You Need Help Learning French?|Sending Out An SOS for Learning French?|Desperately Seeking Help for Learning French?|Learning French? Has It Been A Hard Days Night?|Learning French and Can’t Get No Satisfaction?}|{Hints to Help You Learn French|A Strategy to Learn French|Learning French In a Nutshell|Learning French Is Fun and Easy|Foolproof Method to Learning French}|{Hints and Tips For Learning French|The Best Methods for Learning French|How to Make Your Study of French Easier and More Fun|Learning French Is Not as Hard as You Think|Quick Tips to Make Learning French Easier}}

    Any idea why it does that?

    Thanks

    Katie

  15. #15 by Paul Norman on August 16th, 2010

    @alex rather than modifying the function I would suggest caching the result for a day or a week to achieve the effect that you are looking for. That is, build it out to a flat file or save it in the database, and only rebuild it when the filemtime() or database timestamp shows that the saved copy is older than a certain age.

    @Katie you need double braces everywhere you are going to spin something, you have double ones at the start and single ones everywhere else. I think that’ll fix it. Look at my last example for clarification.
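
    The flat file caching suggested to @alex above could look something like the sketch below. The helper name cached_text, the temp file location and the closure standing in for a call such as spin($string, false) are all assumptions made for illustration:

```php
<?php
// Hypothetical helper (illustration only): returns the cached text while the cache
// file is younger than $maxAgeSeconds, otherwise regenerates and rewrites it.
function cached_text(string $cacheFile, int $maxAgeSeconds, callable $generate): string
{
    // Make sure filemtime() is not answering from PHP's stat cache
    clearstatcache(true, $cacheFile);

    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $maxAgeSeconds) {
        return file_get_contents($cacheFile);
    }

    // Cache missing or too old: build the text again and save it
    $text = $generate();
    file_put_contents($cacheFile, $text, LOCK_EX);

    return $text;
}

// Assumed location for the demo; a real page would use one cache file per URL
$file = sys_get_temp_dir().'/spin-demo-'.uniqid().'.txt';

// The closure stands in for a call such as spin($string, false)
$calls    = 0;
$generate = function () use (&$calls) {
    $calls++;
    return 'spun text #'.$calls;
};

$first  = cached_text($file, 86400, $generate); // generated and written to the file
$second = cached_text($file, 86400, $generate); // served from the file

var_dump($first === $second); // bool(true)

unlink($file); // tidy up the demo file
```

    Here 86400 seconds gives the once-a-day behaviour; 604800 would give once a week.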
