This is a URL scanner that grabs the posted link from a selected comment on a blog. It works from the comment's direct URL and uses the anchor in that URL to find the right comment.
Example: you want to see the link posted with this comment.
In this case, the call to find the link is:
GetCommentLink("http://tmm.tornevall.net/blog/2010/03/14/sa-gor-media-dig-pissed-off-lattast/#comment-2667")
On success, the function returns the link wrapped in an array:
return(array($getlink)); // Found link successfully, returning link as an array.
On failure it returns a numeric fail code instead; see the source for the list.
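To make the return convention concrete, here is a minimal sketch of how a caller might tell the two cases apart. The helper name DescribeCommentLinkResult and the message texts are assumptions made for illustration and are not part of the script; only the numeric codes come from the source.

```php
<?php
// Hypothetical helper (not part of the original script): turns the value
// returned by GetCommentLink() into a readable string.
function DescribeCommentLinkResult($result)
{
    // Success: the link comes back wrapped in an array
    if (is_array($result))
    {
        return "Link: " . $result[0];
    }
    // Failure: a numeric code, as listed under "Failcodes" in the source
    $failcodes = array(
        999 => "No anchor in the URL",
        998 => "Found a href, but the link may be damaged",
        997 => "No link here",
        996 => "The link seems damaged",
        995 => "No matching anchor on the page",
    );
    if (is_int($result) && isset($failcodes[$result]))
    {
        return "Failed (" . $result . "): " . $failcodes[$result];
    }
    // Anything else is outside the documented convention
    return "Unknown result";
}

echo DescribeCommentLinkResult(array("http://example.com/")) . "\n";
echo DescribeCommentLinkResult(997) . "\n";
```

Checking with is_array() first is the safest way to separate a link from a fail code, since both cases come back through the same return value.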
NOTE
Every blog has its own design, and this script does not cover them all.
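To show what "design" means here: the script expects the anchor name to appear somewhere in the page, followed by an element containing the word commentheader, with the poster's link as the first <a href> inside it. The sketch below runs the same idea against a made-up HTML fragment. The markup, the function name FindLinkInFragment and the single-regex href match are all assumptions for illustration; real themes differ, and the script itself walks the string character by character instead.

```php
<?php
// Made-up markup, roughly the shape the scanner expects (real themes differ).
$html = '<li id="comment-2667">'
      . '<div class="commentheader"><a href="http://example.com/" rel="nofollow">Poster</a> says:</div>'
      . '<p>Nice post!</p></li>';

// Hypothetical helper: returns the poster link for $anchor, or null if none is found.
function FindLinkInFragment($html, $anchor)
{
    // Locate the anchor name anywhere in the page (case-insensitive)
    $pos = stripos($html, $anchor);
    if ($pos === false)
    {
        return null;
    }
    // Cut out everything between "commentheader" and the next </div>
    if (!preg_match('/commentheader(.*?)<\/div>/si', substr($html, $pos), $header))
    {
        return null;
    }
    // Take the first quoted href value inside that header
    if (preg_match('/<a\b[^>]*?\bhref\s*=\s*(["\'])(.*?)\1/si', $header[1], $m))
    {
        return $m[2];
    }
    return null;
}

// The anchor is the fragment part of the comment URL
$anchor = parse_url('http://example.com/post/#comment-2667', PHP_URL_FRAGMENT);
var_dump(FindLinkInFragment($html, $anchor));
```

If a theme names its header element differently, the "commentheader" pattern is the part to change.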
Source code
<?php
/*
* WordPress CommentLink-Scanner v1.0.1
*
* Requirements:
*
* 1: Something to download with. This script uses TorneEngine, where cURL does all the
* work. Just make sure you can grab content from URLs.
*
* 2: The web page must contain a "commentheader" for the comment the poster sent.
* Tested live with some WordPress blogs, and all of them returned correct links when
* a link existed. Just keep in mind that the script depends somewhat on the design
* of the blog you use it on, so feel free to change the code to fit your
* needs. I'll try to fix this little issue later...
*
*
* Failcodes
*
* 999 - No anchor.
* 998 - Yes, I found a href, but the link may be damaged.
* 997 - No, there's no link here.
* 996 - No, this link seems damaged.
* 995 - No matching anchor.
*
*/
function GetCommentLink($redirlink = '')
{
    // Load TorneEngine
    global $tornevall;
    // Check whether the redirect link contains an anchor. If not, give up.
    $hashpos = strpos($redirlink, '#');
    if ($hashpos !== false)
    {
        // Grab the anchor: everything after the "#"
        $anchor = (string) substr($redirlink, $hashpos + 1);
    }
    else
    {
        //echo "Found no anchor in redirlink...\n";
        return 999; // Return "no anchor"
    }
    // A link that ends in "#" has an empty anchor; treat it as no anchor
    if ($anchor === '')
    {
        return 999; // Return "no anchor"
    }
    // Grab the webpage with TorneEngine where the anchor should be found
    $d = $tornevall->www->get($redirlink);
    // Find that anchor on the page (stripos is already case-insensitive)
    $startanchorpos = stripos($d, $anchor);
    if ($startanchorpos !== false)
    {
        // OK, the anchor has been found; let's find the "commentheader" for this post
        $anchordata = substr($d, $startanchorpos);
        $commheader = preg_replace("/(.*?)commentheader(.*?)<\/div>(.*)/si", '$2', $anchordata);
        if (preg_match("/<a(.*?)href(.*?)>/i", $commheader))
        {
            // We found a href in the commentheader; now make a sloppy attempt to get the link
            $getlink = preg_replace("/(.*?)<a(.*?)href(.*?)>(.*?)<\/a>(.*)/si", '$3', $commheader);
            // Initialise the loop state so nothing is read before it is set
            $getlinktest = '';
            $strbrk = false;
            $overflowguard = 0;
            $faillink = false;
            // Strip leading characters until we find the opening quote.
            // The overflow guard ends the loop once 19 characters have been
            // examined; we then consider the search a failure.
            while ($strbrk == false)
            {
                $overflowguard++;
                if ($overflowguard >= 20) {break;}
                $getlinktest = substr($getlink, 0, 1);
                $getlink = substr($getlink, 1);
                // Find the valid delimiter here
                if ($getlinktest == "'" || $getlinktest == '"')
                {
                    $strbrk = true;
                }
            }
            // Use that delimiter to find the closing one, but only if the
            // loop really stopped on a quote character
            $endpos = $strbrk ? strpos($getlink, $getlinktest) : false;
            if ($endpos !== false)
            {
                $getlink = substr($getlink, 0, $endpos);
            }
            else
            {
                // If we don't, this lookup has failed. This will probably only
                // happen if the web designer is really sloppy and something is very
                // wrong with the web page we're looking at...
                $faillink = true;
            }
            if (!$faillink)
            {
                // When we get our link, make sure it contains no spaces
                $linktest = explode(" ", $getlink);
                if (sizeof($linktest) == 1)
                {
                    // Return the link, but as an array
                    return(array($getlink));
                }
                else
                {
                    return 996; // Return "no, this link seems damaged"
                }
            }
            else
            {
                return 998; // Return "yes, I found a href, but the link may be damaged"
            }
        }
        else
        {
            return 997; // Return "no, there's no link here"
        }
    }
    else
    {
        return 995; // Return "no matching anchor"
    }
}
?>
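The only thing the function needs from TorneEngine is the page body as a string. If TorneEngine is not available, something like the following could stand in for the $tornevall->www->get() call. This is a sketch using PHP's cURL extension, not TorneEngine's API; the function name FetchPage and the option choices (following redirects, a 15 second timeout) are assumptions.

```php
<?php
// Hypothetical replacement for $tornevall->www->get(): fetch a URL with the
// cURL extension and return the body as a string, or false on failure.
function FetchPage($url)
{
    $ch = curl_init($url);
    if ($ch === false)
    {
        return false;
    }
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, 15);          // give up after 15 seconds
    // curl_exec() returns the body as a string, or false if the transfer failed
    return curl_exec($ch);
}
```

Inside GetCommentLink, the line that loads the page would then read $d = FetchPage($redirlink); and it is worth checking for false before scanning the result.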

