Commonly Used Functions in PHP Scraping Programs

Date: 2011-09-19 14:23:15  Keywords: PHP, scraping, functions

1. A PHP function to get the current URL

function get_php_url() {
    // Prefer REQUEST_URI, which already includes the query string.
    if (! empty ( $_SERVER ["REQUEST_URI"] )) {
        $nowurl = $_SERVER ["REQUEST_URI"];
    } else {
        // Fall back to the script path plus the query string, if there is one.
        $nowurl = $_SERVER ["PHP_SELF"];
        if (! empty ( $_SERVER ["QUERY_STRING"] ))
            $nowurl .= "?" . $_SERVER ["QUERY_STRING"];
    }
    return $nowurl;
}
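The function above returns only the path and query string, not the scheme or host. When a scraper needs an absolute URL, something along these lines may help. This is a minimal sketch: the helper name build_full_url is hypothetical, it takes a $_SERVER-style array as a parameter so it can be tried outside a web request, and it assumes the HTTPS and HTTP_HOST entries are set the way common web servers set them.

```php
<?php
// Hypothetical helper (not part of the original article): builds an
// absolute URL from a $_SERVER-style array.
function build_full_url(array $server): string {
    // Most servers set HTTPS to a non-empty value other than "off" for TLS requests.
    $https  = !empty($server['HTTPS']) && $server['HTTPS'] !== 'off';
    $scheme = $https ? 'https' : 'http';
    // Assumption: fall back to "localhost" when no Host header is available.
    $host   = $server['HTTP_HOST'] ?? 'localhost';

    if (!empty($server['REQUEST_URI'])) {
        // REQUEST_URI already contains the query string.
        $path = $server['REQUEST_URI'];
    } else {
        $path = $server['PHP_SELF'] ?? '/';
        if (!empty($server['QUERY_STRING'])) {
            $path .= '?' . $server['QUERY_STRING'];
        }
    }
    return $scheme . '://' . $host . $path;
}

// Inside a web request you would pass $_SERVER itself:
// echo build_full_url($_SERVER);
```

Note that HTTP_HOST comes from the client's request headers, so it should not be trusted for security-sensitive decisions.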

2. Converting full-width digits to half-width digits in PHP

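function GetAlabNum($fnum) {
    // Full-width digits and their half-width (ASCII) counterparts.
    $nums = array ("０", "１", "２", "３", "４", "５", "６", "７", "８", "９" );
    $fnums = "0123456789";
    for($i = 0; $i <= 9; $i ++)
        $fnum = str_replace ( $nums [$i], $fnums [$i], $fnum );
    // Drop everything except digits and dots, and strip leading zeros.
    // preg_replace is used because ereg_replace was removed in PHP 7.
    $fnum = preg_replace ( '/[^0-9.]|^0+/', '', $fnum );
    if ($fnum == "")
        $fnum = 0;
    return $fnum;
}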
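If you only want to normalize the digits and keep the rest of the text intact, a lookup table with strtr is one possible approach. This is a sketch under stated assumptions: the function name fullwidth_digits_to_ascii is hypothetical, and the source file and input are assumed to be UTF-8 encoded.

```php
<?php
// Hypothetical helper (not part of the original article): replaces
// full-width digits with ASCII digits and leaves all other text untouched.
function fullwidth_digits_to_ascii(string $text): string {
    $map = [
        '０' => '0', '１' => '1', '２' => '2', '３' => '3', '４' => '4',
        '５' => '5', '６' => '6', '７' => '7', '８' => '8', '９' => '9',
    ];
    // strtr with an array performs all replacements in a single pass.
    return strtr($text, $map);
}

echo fullwidth_digits_to_ascii('Price: １２８０'), "\n"; // Price: 1280
```

Unlike GetAlabNum, this version does not remove non-numeric characters or leading zeros, so it suits cases where the digits are embedded in a longer string.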

3. Removing (escaping) HTML markup in PHP

// Convert plain text so that it displays literally inside an HTML page
function Text2Html($txt) {
    $txt = str_replace ( " ", "&nbsp;", $txt );
    $txt = str_replace ( "<", "&lt;", $txt );
    $txt = str_replace ( ">", "&gt;", $txt );
    // Collapse runs of line breaks into a single CRLF.
    $txt = preg_replace ( "/[\r\n]+/", "\r\n", $txt );
    return $txt;
}
// Neutralize HTML markup by escaping the angle brackets
function ClearHtml($str) {
    $str = str_replace ( '<', '&lt;', $str );
    $str = str_replace ( '>', '&gt;', $str );
    return $str;
}
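PHP also ships built-in functions for both jobs, and it helps to keep the two apart: htmlspecialchars escapes markup so that it is displayed as text, while strip_tags deletes the tags and keeps only the text between them. A short sketch with a made-up sample string:

```php
<?php
$html = '<p>Hello <b>world</b></p>';

// Escaping: the tags survive as visible text.
$escaped = htmlspecialchars($html, ENT_QUOTES, 'UTF-8');

// Stripping: the tags are deleted, the text content remains.
$stripped = strip_tags($html);

echo $escaped, "\n";  // &lt;p&gt;Hello &lt;b&gt;world&lt;/b&gt;&lt;/p&gt;
echo $stripped, "\n"; // Hello world
```

For scraped content that will be stored as plain text, strip_tags is usually closer to what "removing markup" means; for content echoed back into a page, escaping is the safer default.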

4. Getting all hyperlinks on a page in PHP

function get_all_url($code) {
    // Group 1 captures the href value, group 2 the link text.
    preg_match_all ( '/<a\s+href=["|\']?([^>"\' ]+)["|\']?\s*[^>]*>([^>]+)<\/a>/i', $code, $arr );
    return array ('name' => $arr [2], 'url' => $arr [1] );
}
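A regular expression like the one above only matches links whose href is the first attribute and whose text contains no nested tags. As an alternative, the links can be collected with PHP's DOM extension. The following is a sketch, not the article's method: the function name extract_links is hypothetical, it assumes the dom extension is enabled, and it returns the same array shape ('name' and 'url') for easy comparison. Be aware that loadHTML guesses the character encoding, so non-ASCII pages may need an explicit charset declaration.

```php
<?php
// Hypothetical helper (not part of the original article): collects the
// href and text of every <a> element using DOMDocument.
function extract_links(string $html): array {
    $result = ['name' => [], 'url' => []];
    if (trim($html) === '') {
        return $result; // loadHTML rejects empty input on PHP 8
    }

    $doc = new DOMDocument();
    // Real-world pages are rarely valid HTML, so collect parser warnings
    // internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('a') as $a) {
        if (!$a->hasAttribute('href')) {
            continue; // skip named anchors without a target
        }
        $result['url'][]  = $a->getAttribute('href');
        $result['name'][] = trim($a->textContent);
    }
    return $result;
}

$links = extract_links('<p><a class="x" href="/b"><b>Second</b></a></p>');
// $links['url'] is ['/b'], $links['name'] is ['Second']
```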

5. Getting the content between specified markers in PHP

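function get_tag_data($str, $start, $end) {
    // Both markers are required.
    if ($start == '' || $end == '') {
        return;
    }
    // Keep the text after the first occurrence of $start ...
    $str = explode ( $start, $str );
    if (! isset ( $str [1] )) {
        return; // start marker not found
    }
    // ... and cut it off at the first occurrence of $end.
    $str = explode ( $end, $str [1] );
    return $str [0];
}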
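The same idea can be written with strpos and substr, which avoids splitting the whole string into arrays. This is a sketch with a hypothetical function name, extract_between. It deliberately differs from get_tag_data in one respect: it returns null when the end marker is missing, instead of returning everything after the start marker.

```php
<?php
// Hypothetical helper (not part of the original article): returns the text
// between the first $start marker and the next $end marker, or null.
function extract_between(string $text, string $start, string $end): ?string {
    if ($start === '' || $end === '') {
        return null;
    }
    $from = strpos($text, $start);
    if ($from === false) {
        return null; // start marker not found
    }
    $from += strlen($start); // move past the start marker
    $to = strpos($text, $end, $from);
    if ($to === false) {
        return null; // end marker not found after the start marker
    }
    return substr($text, $from, $to - $from);
}

echo extract_between('<title>Demo page</title>', '<title>', '</title>'), "\n"; // Demo page
```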

Reposted from: http://www.wangchong.org/spider/10.html
