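I have a PHP page that I run every minute through a cron job.
It ran without problems for a long time, but it has suddenly started throwing errors like this: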
Maximum execution time of 30 seconds exceeded in /home2/sharingi/public_html/scrape/functions.php on line 84
The line number changes with each error, ranging from line 70 into the 90s.
Here is the code for lines 0-95:
function crawl_page( $base_url, $target_url, $userAgent, $links) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
    curl_setopt($ch, CURLOPT_URL,$target_url);
    curl_setopt($ch, CURLOPT_FAILONERROR, false);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_AUTOREFERER, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER,true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 100);
    curl_setopt($ch, CURLOPT_MAXREDIRS, 10); //follow up to 10 redirections - avoids loops
    $html = curl_exec($ch);
    if (!$html) {
        echo "\ncURL error number:" .curl_errno($ch);
        echo "\ncURL error:" . curl_error($ch);
        //exit;
    }
    //
    // load scrapped data into the DOM
    //
    $dom = new DOMDocument();
    @$dom->loadHTML($html);
    //
    // get only LINKS from the DOM with XPath
    //
    $xpath = new DOMXPath($dom);
    $hrefs = $xpath->evaluate("/html/body//a");
    //
    // go through all the links and store to db or whatever
    //
    for ($i = 0; $i < $hrefs->length; $i++) {
        $href = $hrefs->item($i);
        $url = $href->getAttribute('href');
        //if the $url does not contain the web site base address: http://www.thesite.com/ then add it onto the front
        $clean_link = clean_url( $base_url, $url, $target_url);
        $clean_link = str_replace( "http://" , "" , $clean_link);
        $clean_link = str_replace( "//" , "/" , $clean_link);
        $links[] = $clean_link;
        //removes empty array values
        foreach($links as $key => $value) {
            if($value == "") {
                unset($links[$key]);
            }
        }
        $links = array_values($links);
        //removes javascript lines
        foreach ($links as $key => $value) {
            if ( strpos( $value , "javascript:") !== FALSE ) {
                unset($links[$key]);
            }
        }
        $links = array_values($links);
        // removes @ lines (email)
        foreach ($links as $key => $value) {
            if ( strpos( $value , "@") !== FALSE || strpos( $value, 'mailto:') !== FALSE) {
                unset($links[$key]);
            }
        }
        $links = array_values($links);
    }
    return $links;
}
What is causing these errors, and how can I prevent them?
You should set the maximum execution time with the set_time_limit function. If you want no time limit (most likely your case), use:
set_time_limit(0);
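Lifting the limit addresses the symptom. Since the reported line numbers all seem to fall inside the link loop, it may also help to do less work there: the question's code re-filters the whole $links array on every iteration. The sketch below is only one possible arrangement. It calls set_time_limit(0) at the top of the script and filters the collected links once. The helper name filter_links and the sample URLs are hypothetical and do not appear in the original code.

```php
<?php
// Remove the execution time limit for this cron-driven script.
set_time_limit(0);

/**
 * Hypothetical helper (not part of the original code): drops empty
 * entries, javascript: pseudo-links, and e-mail links in a single pass.
 */
function filter_links(array $links)
{
    $kept = array_filter($links, function ($link) {
        $link = (string) $link;
        return $link !== ""
            && strpos($link, "javascript:") === false
            && strpos($link, "@") === false
            && strpos($link, "mailto:") === false;
    });

    // Reindex from 0, as the original code does with array_values().
    return array_values($kept);
}

// Example usage with made-up links; in crawl_page() this call would
// run once, after the for loop has finished collecting links.
$links = filter_links(array(
    "example.com/a",
    "",
    "javascript:void(0)",
    "mailto:someone@example.com",
    "example.com/b",
));
print_r($links);
```

Inside crawl_page(), this would mean keeping only the $links[] = $clean_link; line in the loop and calling the helper once before return $links;, assuming nothing else depends on the array being filtered mid-loop.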