In PHP, I use cURL to fetch a remote site's web page with
$html = curl_exec($curl), but the result is in HTML format. Is it
possible to save only the inner text, without any tags such as <html><div>
</div></html>, using a built-in PHP function? Otherwise I will need to write code to extract the text between the tags myself.
Please advise
Duncan
if (!empty($get_string)) $url .= '?' . $get_string;
$curl = curl_init();

// HEADERS AND OPTIONS APPEAR TO BE A FIREFOX BROWSER REFERRED BY GOOGLE
$header[] = "Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
$header[] = "Cache-Control: max-age=0";
$header[] = "Connection: keep-alive";
$header[] = "Keep-Alive: 300";
$header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
$header[] = "Accept-Language: en-us,en;q=0.5";
$header[] = "Pragma: "; // BROWSERS USUALLY LEAVE BLANK

// SET THE CURL OPTIONS - SEE http://php.net/manual/en/function.curl-setopt.php
curl_setopt( $curl, CURLOPT_URL, $url );
curl_setopt( $curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.6) Gecko/20091201 Firefox/3.5.6' );
curl_setopt( $curl, CURLOPT_HTTPHEADER, $header );
curl_setopt( $curl, CURLOPT_REFERER, 'http://www.google.com' );
curl_setopt( $curl, CURLOPT_ENCODING, 'gzip,deflate' );
curl_setopt( $curl, CURLOPT_AUTOREFERER, TRUE );
curl_setopt( $curl, CURLOPT_RETURNTRANSFER, TRUE );
// curl_setopt( $curl, CURLOPT_FOLLOWLOCATION, TRUE );
curl_setopt( $curl, CURLOPT_TIMEOUT, $timeout );

// RUN THE CURL REQUEST AND GET THE RESULTS
$htm = curl_exec($curl);
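To get from the fetched HTML to plain text, PHP's built-in strip_tags() is probably the closest match to what the question asks for. The sketch below is one possible approach, not the only one: html_to_text() is a hypothetical helper name (it is not a PHP built-in), and the sample string stands in for whatever curl_exec() returned.

```php
<?php
// Hypothetical helper (not a PHP built-in): reduce an HTML string to plain text.
function html_to_text(string $html): string
{
    // Drop <script> and <style> blocks first; strip_tags() alone would keep
    // the code or CSS between those tags as if it were page text.
    $html = preg_replace('#<(script|style)\b[^>]*>.*?</\1>#is', ' ', $html) ?? $html;

    // Built-in: remove all remaining HTML and PHP tags.
    $text = strip_tags($html);

    // Built-in: turn entities such as &amp; back into literal characters.
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Collapse runs of whitespace left behind by the removed markup.
    return trim(preg_replace('/\s+/', ' ', $text) ?? $text);
}

// Sample input standing in for the string returned by curl_exec().
$sample = '<html><body><div>Hello <b>world</b> &amp; more</div>'
        . '<script>var x = 1;</script></body></html>';

echo html_to_text($sample), PHP_EOL;
```

Note that strip_tags() does not insert spaces where tags were, so text from adjacent block elements (for example two consecutive <div> elements) can run together. If that matters, loading the page into DOMDocument and reading textContent node by node is an alternative worth considering.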