Half an IP address and the fix for the Yahoo Search API 999 error

The Yahoo Search API has been driving me nuts.  I started getting 999 errors on every call I made from a new app I was developing (“999 Rate Limit Exceeded” was returned in the response).  The maddening part was that I was only making a handful of calls per day – I knew I wasn’t exceeding the 5000-queries-per-day rate limit advertised by Yahoo for the Search API.

FYI – skip to the end for the fix…

Since the rate limit is applied per IP address, I convinced myself that my problem was due to somebody else’s app running on the same server.  Not ideal, but that would be one of the downsides of shared hosting.  So I immediately signed up for a unique IP.  The next day was spent figuring out why this didn’t make any difference.  Turns out a unique IP on shared hosting isn’t actually a complete IP – it’s only half an IP.  Unique/dedicated IPs are generally provided on shared hosting for SSL, which only requires incoming requests for your site to be directed to a unique IP address.  Outgoing requests (e.g. from a PHP script) do *not* originate from the dedicated IP but from the IP address of the shared Apache instance.  Ugh.  Understandable in hindsight, but it totally didn’t meet my expectations.  Upgrading to a private server or virtual private server would get me a “whole” dedicated IP… maybe when this app starts making cash I’ll spring for it.
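
One way to confirm which address outbound calls really use is to have the script ask an external "what is my IP" endpoint and compare the answer with the address the site is served on.  The sketch below assumes such an endpoint exists at the hypothetical URL `http://ip-echo.example/` and replies with the caller's IP as plain text; the function names are illustrative, not part of any library.

```php
<?php
// Sketch only: IP_ECHO_URL is a hypothetical endpoint that is assumed to
// reply with the caller's IP address as plain text.
const IP_ECHO_URL = 'http://ip-echo.example/';

// Ask the echo endpoint which address our outgoing request came from.
// Returns the trimmed response body, or null if the request failed.
function fetch_outbound_ip(string $echoUrl): ?string
{
    $ch = curl_init($echoUrl);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    $body = curl_exec($ch);
    return $body === false ? null : trim($body);
}

// True when the outgoing address is known and differs from the address the
// site is served on, i.e. the "dedicated" IP only covers incoming traffic.
function is_half_ip(string $inboundIp, ?string $outboundIp): bool
{
    return $outboundIp !== null && $outboundIp !== $inboundIp;
}
```

Called during a web request as `is_half_ip($_SERVER['SERVER_ADDR'], fetch_outbound_ip(IP_ECHO_URL))`, a result of `true` would point to the situation described above.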

Anyway, after figuring out the above, I spent an hour setting up routes through my various firewalls and installing a proxy (CCProxy worked perfectly).  That let me pass the request through my laptop and hence originate it from a different IP address, one which I knew for sure wouldn’t be rate limited.
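
On the PHP side, sending a call through a proxy comes down to one extra cURL option, `CURLOPT_PROXY`.  A minimal sketch, assuming a proxy is listening at the hypothetical address `192.0.2.25:8080` (substitute the host and port of your own proxy); `format_proxy()` and `fetch_via_proxy()` are illustrative helpers, not cURL functions.

```php
<?php
// Join host and port into the "host:port" form that CURLOPT_PROXY accepts.
function format_proxy(string $host, int $port): string
{
    return $host . ':' . $port;
}

// Fetch $url through the given proxy and return the HTTP status and body.
function fetch_via_proxy(string $url, string $proxy): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return body as a string
    curl_setopt($ch, CURLOPT_PROXY, $proxy);        // route via the proxy
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    $body = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    return ['status' => $status, 'body' => $body];
}
```

A call would then look like `fetch_via_proxy($url, format_proxy('192.0.2.25', 8080))`, with the request reaching the API from the proxy's address rather than the web server's.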

Guess what.  Still getting 999 errors.  However, copying the URL into my browser returned the entire set of results immediately.  Same IP, same request, no?  Ugh!  Something crucial had to be different between the request coming from my browser and the one coming from my script.  After experimenting with every cURL option in the known universe I finally figured out the undocumented API requirement.
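
A quicker way to hunt for that kind of difference is to have cURL report the exact request headers it sent and compare them with what the browser sends.  The sketch below uses cURL's `CURLINFO_HEADER_OUT` option for this; the two helper names are illustrative.

```php
<?php
// Perform a GET and return the status code, the raw request headers that
// cURL actually sent, and the response body.
function fetch_with_request_dump(string $url): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLINFO_HEADER_OUT, true); // record outgoing headers
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    $body = curl_exec($ch);
    return [
        'status'  => curl_getinfo($ch, CURLINFO_HTTP_CODE),
        'request' => (string) curl_getinfo($ch, CURLINFO_HEADER_OUT),
        'body'    => $body,
    ];
}

// Check a raw header dump for a header by name (case-insensitive).
function has_header(string $rawHeaders, string $name): bool
{
    $pattern = '/^' . preg_quote($name, '/') . '\s*:/mi';
    return preg_match($pattern, $rawHeaders) === 1;
}
```

If `has_header($result['request'], 'User-Agent')` comes back `false` for the script's request, that is one difference from a browser request worth testing.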

The Yahoo Search API now requires the HTTP User Agent to be set.  For example:
curl_setopt($session,CURLOPT_USERAGENT,"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)");

This appears to be a new, undocumented requirement, but it does match the User Agent requirement listed for the Shopping API.  I tried various User Agent strings and just about anything seems to be accepted, but Yahoo suggests faking a commonly used User Agent string for the Shopping API, so it’s probably best to stick with a browser UA.
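
Putting it together, a call with the User Agent set and the HTTP status checked might look like the sketch below.  The endpoint URL and the `appid`, `query` and `results` parameter names are taken from the code a reader posted in the comments rather than from Yahoo's documentation, `YOUR_APP_ID` is a placeholder, and the function names are illustrative.

```php
<?php
// Endpoint as it appears in a reader comment on this post; an assumption,
// not a value checked against Yahoo's documentation.
const YAHOO_WEB_SEARCH = 'http://api.search.yahoo.com/WebSearchService/V1/webSearch';

// Build the request URL; http_build_query() URL-encodes each value.
function build_search_url(string $appId, string $query, int $results = 10): string
{
    $params = [
        'appid'   => $appId,
        'query'   => $query,
        'results' => $results,
    ];
    return YAHOO_WEB_SEARCH . '?' . http_build_query($params, '', '&');
}

// Fetch a URL with an explicit User Agent and return status and body.
function fetch_with_user_agent(string $url, string $userAgent): array
{
    $session = curl_init($url);
    curl_setopt($session, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($session, CURLOPT_USERAGENT, $userAgent);
    curl_setopt($session, CURLOPT_TIMEOUT, 10);
    $body = curl_exec($session);
    $status = curl_getinfo($session, CURLINFO_HTTP_CODE);
    return ['status' => $status, 'body' => $body];
}
```

With `$result = fetch_with_user_agent(build_search_url('YOUR_APP_ID', 'some query'), 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)')`, a value of 999 in `$result['status']` would mean the request is still being rejected.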

Anyway, worked like a treat.  The stupid little things always suck up the most time.

8 Responses to “Half an IP address and the fix for the Yahoo Search API 999 error”

  1. Hemachander Says:

    I have the same problem fetching backlinks with the Yahoo API. It does not return any results through PHP code, but when I try the Yahoo API URL in the browser it works and returns the default 50 URLs.
    Please have a look at my code and suggest a correction. Thank you.

    Link Text Explorer
    The title, link text and number of external links is retrieved from each page, together with the page PageRank and domain PageRank (if required).

    <form action="" method="get">

    URL<input type="text" name="query" size="60" value="" />
    Results

    <option value="" >

    PageRank
    <input type="checkbox" name="pagepr" id="page" />Page 
    <input type="checkbox" name="domainpr" id="domain" />Domain

    <?php

    function get_html ($url) {
        $html = "";
        $timeout = 10;
        ini_set('user_agent','Mozilla: (compatible; Windows XP)');
        $old = ini_set('default_socket_timeout', $timeout);
        $fh = fopen($url, 'r');
        if ($fh) {
            ini_set('default_socket_timeout', $old);
            stream_set_timeout($fh, $timeout);
            stream_set_blocking($fh, 0);
            while (! feof($fh)) {
                $html .= fread($fh, 4096);
            }
            fclose($fh);
            return $html;
        } else {
            return 0;
        }
    }

    function get_pageinfo($url, $query) {
        $html = get_html ($url);
        if ($html) {
            $pageinfo['success'] = 1;
            $html = preg_replace('/\n/', ' ', $html);
            $pattern = '##im';
            preg_match_all($pattern, $html, $matches);
            $pageinfo['ExternalLinks'] = 0;
            $linktext = "";
            if ($matches) {
                foreach ($matches[0] as $match) {
                    if ( preg_match("#$query#i", $match) ) {
                        $text = "";
                        if ( preg_match('#<img#im', $match) ) {
                            $text .= "[IMG]";
                            if ( preg_match('#alt\s*=\s*"(.*?)"#im', $match, $alt) ) {
                                $text .= " $alt[1]";
                            }
                        } else {
                            $text = strip_tags($match);
                        }
                        $pageinfo['LinkText'] = $text;
                    }
                    if ( preg_match("#http://#i", $match) ) {
                        $pageinfo['ExternalLinks']++;
                    }
                }
            }
        } else {
            $pageinfo['success'] = 0;
        }
        return $pageinfo;
    }

    if ( isset($_GET['query']) ) {

    require_once ('pagerank2.php');
    ?>
    Results
    Notes

    [IMG] denotes link is an image. The text following is the ALT text.
    [X] denotes the page could not be loaded.

    <?php

    flush();

    echo "$query has a PageRank of ".trim(getrank($query)).".\n";
    flush();
    #mryahoodemo
    $params = array(
        "appid" => "mryahoodemo",
        "query" => $query,
        "results" => $num,
        "start" => $start,
        "omit_inlinks" => "domain"
    );

    $request = "";
    foreach ($params as $param => $value) {
        $request .= "$param=$value&";
    }

    $yahoo_api ="http://api.search.yahoo.com/WebSearchService/V1/webSearch?";
    #$yahoo_api="http://search.yahooapis.com/SiteExplorerService/V1/inlinkData?";
    #$yahoo_api = "http://api.search.yahoo.com/SiteExplorerService/V1/inlinkData?";
    $ResultSet = simplexml_load_file ( urlencode($yahoo_api.$request) );

    if ($ResultSet) {

    $totalResultsAvailable = $ResultSet['totalResultsAvailable'];
    $totalResultsReturned = $ResultSet['totalResultsReturned'];
    $firstResultPosition = $ResultSet['firstResultPosition'];
    $lastResultPosition = $firstResultPosition + $params['results'] - 1;

    if ($totalResultsReturned > $totalResultsAvailable) {
        $lastResultPosition = $totalResultsAvailable;
    }

    echo "Results $firstResultPosition - $lastResultPosition of about ".number_format($totalResultsAvailable)." for $query.\n";
    ?>

    Sort by: External Links | Page PageRank | Domain PageRank

    # | URL | Visit | Title | Link Text | External Links | Page PageRank | Domain PageRank

    <?php foreach ($ResultSet->Result as $Result) {

        // InLinks
        $Title = $Result->Title;
        $Url = $Result->Url;
        $ClickUrl = $Result->ClickUrl;

        // PageInfo
        $PageInfo = get_pageinfo($Url, $query);
        if ($PageInfo['success']) {
            $success = "";
        } else {
            $success = "[X]";
        }

        // Link Text
        $LinkText = "";
        if ($PageInfo['success']) {
            if (isset($PageInfo['LinkText'])) {
                $LinkText = $PageInfo['LinkText'];
            } else {
                $LinkText = "[NULL]";
            }
        }
        if ($LinkText != "[NULL]") {

            if (++$count%2) { $class = "even"; } else { $class = "odd"; }
            echo "";
            echo "".$count."\n";
            echo "".preg_replace('#^http://#', '', wordwrap($Url, 25, " ", 1))."";
            echo "".$success."";
            echo "
    [visit]";
            echo "".$Title."";

            // External Links
            $ExternalLinks = $PageInfo['ExternalLinks'];
            echo "".$LinkText."";
            echo "".$ExternalLinks."";

            // Page PageRank
            if ($pagepr) {
                $PageRank = getrank($Url);
                if (!isset($PageRank)) {$PageRank = 0;}
                echo "".$PageRank."";
            }

            // Domain PageRank
            if ($domainpr) {
                $domain_array = explode("/", preg_replace('#^http[s]*://#', '', $Url));
                $domain = $domain_array[0];
                $PageRank2 = getrank($domain);
                if (!isset($PageRank2)) {$PageRank2 = 0;}
                echo "".$PageRank2."";
            }
            flush();

        }
    }
    ?>

    var results = new SortableTable(document.getElementById("results_table"),
        ["Number", "CaseInsensitiveString", "CaseInsensitiveString", "CaseInsensitiveString","CaseInsensitiveString", "CaseInsensitiveString", "Number", "Number", "Number"]);
    document.getElementById("sortby").style.display = "block";
    document.getElementById("progress").style.display = "none";

    function addClassName(el, sClassName) {
        var s = el.className;
        var p = s.split(" ");
        var l = p.length;
        for (var i = 0; i < l; i++) {
            if (p[i] == sClassName)
                return;
        }
        p[p.length] = sClassName;
        el.className = p.join(" ").replace( /(^\s+)|(\s+$)/g, "" );
    }

    function removeClassName(el, sClassName) {
        var s = el.className;
        var p = s.split(" ");
        var np = [];
        var l = p.length;
        var j = 0;
        for (var i = 0; i < l; i++) {
            if (p[i] != sClassName)
                np[j++] = p[i];
        }
        el.className = np.join(" ").replace( /(^\s+)|(\s+$)/g, "" );
    }

    results.onsort = function () {
        var rows = this.tBody.rows;
        var l = rows.length;
        for (var i = 0; i < l; i++) {
            removeClassName(rows[i], i % 2 ? "even" : "odd");
            addClassName(rows[i], i % 2 ? "odd" : "even");
        }
    };

    © Intelligent Positioning

  2. Ryan Says:

    Thanks, this saved me a ton of time troubleshooting this bug!

  3. Jonathan Says:

    Hemachander, you have a lot of functionality in this one file. I suggest you break it down and output the raw results you’re getting back from the call – including the HTTP code to see what’s going on.

  4. Jonathan Says:

    Ryan – no problem – glad it helped!

  5. James Says:

    Hi Jonathan,

    Well done for finding the solution. I had a feeling it could be the User Agent but I had it set at Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0) which, after a few calls, gave the dreaded 999 response. Your user agent fixed it. Cheers!

  6. mmfoscar Says:

    Great bug tracking (Y)

    Thanks a lot for the post.

  7. Jonathan Says:

    James and mmfoscar – thanks for leaving a comment – glad it helped!!

  8. Internet marketingtips Says:

    Internet marketingtricks…

    [...]Affiliate Marketing » Half an IP address and the fix for the Yahoo Search API 999 error » Affiliates on Fire » Blog Archive[...]…
