A collection of commonly used PHP scraping functions

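Over the past few days I have been looking into web scraping with PHP and found it remarkably convenient. I am collecting the functions I use most often here for future reference.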

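When scraping pages with PHP, the most common tasks are filtering out special characters and saving the images embedded in the content. Below are the functions I reach for most often when writing a PHP scraper.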

Get the text and address of every link

    function getAllURL($code) {
        // Group 1 captures the href value, group 2 the link text.
        preg_match_all('/<a\s+href=["\']?([^>"\' ]+)["\']?\s*[^>]*>([^>]+)<\/a>/i', $code, $arr);
        return array('name' => $arr[2], 'url' => $arr[1]);
    }

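
The regular expression above only matches links whose `href` is the first attribute and whose text contains no nested tags. As a hedged alternative, here is a minimal sketch that uses PHP's DOM extension instead. The function name `getAllLinksDom` is hypothetical (it is not part of the original collection), and the sketch assumes the DOM extension is enabled and that the input is UTF-8.

```php
<?php
// Hypothetical helper, not from the original post: extract links with DOMDocument.
function getAllLinksDom(string $html): array
{
    $doc = new DOMDocument();
    // Scraped pages are rarely valid HTML, so collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    // The XML declaration prefix makes libxml read the input as UTF-8 (an assumption).
    $doc->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = array('name' => array(), 'url' => array());
    foreach ($doc->getElementsByTagName('a') as $a) {
        if (!$a->hasAttribute('href')) {
            continue; // Skip named anchors that carry no href.
        }
        $links['name'][] = trim($a->textContent); // Text with nested tags removed.
        $links['url'][]  = $a->getAttribute('href');
    }
    return $links;
}

// Usage (same return shape as getAllURL):
//   $links = getAllLinksDom($pageHtml);
//   $links['url'][0], $links['name'][0], ...
```

The return value keeps the same `name` / `url` layout as `getAllURL`, so the two can be swapped without changing the calling code.
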
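Get the address of every image

    function getImgSrc($code) {
        // Matches absolute image URLs inside <img ... src="...">; U makes the quantifiers lazy.
        $reg = '#<img[^>]*src="(https?://(.+)/(.+)\.(jpg|jpeg|gif|bmp|png))"#isU';
        preg_match_all($reg, $code, $img_array, PREG_PATTERN_ORDER);
        return $img_array[1];
    }
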
Get the URL of the current script

    function getSelfURL() {
        if (!empty($_SERVER["REQUEST_URI"])) {
            $scriptName = $_SERVER["REQUEST_URI"];
            $nowurl = $scriptName;
        } else {
            // Fall back to PHP_SELF plus the query string when REQUEST_URI is unavailable.
            $scriptName = $_SERVER["PHP_SELF"];
            if (empty($_SERVER["QUERY_STRING"])) $nowurl = $scriptName;
            else $nowurl = $scriptName . "?" . $_SERVER["QUERY_STRING"];
        }
        return $nowurl;
    }

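Convert full-width digits to half-width digits

    function getAlabNum($fnum) {
        $nums = array("０", "１", "２", "３", "４", "５", "６", "７", "８", "９");
        $fnums = "0123456789";
        for ($i = 0; $i <= 9; $i++) $fnum = str_replace($nums[$i], $fnums[$i], $fnum);
        // ereg_replace() was removed in PHP 7, so preg_replace() is used instead.
        // Drops everything except digits and dots, plus any leading zeros.
        $fnum = preg_replace("/[^0-9.]|^0{1,}/", "", $fnum);
        if ($fnum == "") $fnum = 0;
        return $fnum;
    }
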
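
`getAlabNum` discards every character that is not a digit or a dot, which suits a field that holds a single number. When the surrounding text has to be kept, a plain character mapping may be enough. The following is a minimal sketch under that assumption. The name `fullWidthDigitsToAscii` is hypothetical, and the source file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helper, not from the original post:
// map full-width digits (U+FF10 to U+FF19) to ASCII digits and leave other text alone.
function fullWidthDigitsToAscii(string $text): string
{
    static $map = array(
        '０' => '0', '１' => '1', '２' => '2', '３' => '3', '４' => '4',
        '５' => '5', '６' => '6', '７' => '7', '８' => '8', '９' => '9',
    );
    // strtr() with an array replaces every key it finds in a single pass.
    return strtr($text, $map);
}

echo fullWidthDigitsToAscii('价格：１２３.５０元'), PHP_EOL; // prints: 价格：123.50元
```
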
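Convert plain text to HTML (escape angle brackets, turn line breaks into <br/>)

    function text2Html($txt) {
        // Turn each pair of spaces into one full-width space so indentation survives rendering.
        $txt = str_replace("  ", "　", $txt);
        $txt = str_replace("<", "&lt;", $txt);
        $txt = str_replace(">", "&gt;", $txt);
        $txt = preg_replace("/[\r\n]{1,}/isU", "<br/>\r\n", $txt);
        return $txt;
    }
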
Neutralize HTML tags (escape the angle brackets)

    function clearHtml($str) {
        $str = str_replace('<', '&lt;', $str);
        $str = str_replace('>', '&gt;', $str);
        return $str;
    }

Convert relative paths to absolute paths

    function relative2Absolute($content, $feed_url) {
        // $protocol[0] holds the scheme prefix, for example "http://".
        preg_match('/(http|https|ftp):\/\//', $feed_url, $protocol);
        // Strip the scheme, then everything after the host name.
        $server_url = preg_replace("/(http|https|ftp|news):\/\//", "", $feed_url);
        $server_url = preg_replace("/\/.*/", "", $server_url);
        if ($server_url == '') {
            return $content;
        }
        if (isset($protocol[0])) {
            // Only root-relative paths (href="/..." and src="/...") are rewritten.
            $new_content = preg_replace('/href="\//', 'href="' . $protocol[0] . $server_url . '/', $content);
            $new_content = preg_replace('/src="\//', 'src="' . $protocol[0] . $server_url . '/', $new_content);
        } else {
            $new_content = $content;
        }
        return $new_content;
    }

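
`relative2Absolute` rewrites only paths that begin with a slash. To illustrate how other common forms could be handled, here is a minimal sketch that resolves a single URL against the address of the page it was found on. The name `resolveUrl` is hypothetical. The sketch deliberately leaves out `../` normalization and URLs that consist only of a query string or a fragment.

```php
<?php
// Hypothetical helper, not from the original post:
// resolve one URL found in a page against that page's own URL.
function resolveUrl(string $url, string $pageUrl): string
{
    $base = parse_url($pageUrl);
    if ($base === false || !isset($base['scheme'], $base['host'])) {
        return $url; // The page URL is not absolute, so there is nothing to resolve against.
    }
    $origin = $base['scheme'] . '://' . $base['host']
            . (isset($base['port']) ? ':' . $base['port'] : '');

    if (preg_match('#^[a-z][a-z0-9+.-]*:#i', $url)) {
        return $url;                          // Already absolute (http:, mailto:, ...).
    }
    if (strpos($url, '//') === 0) {
        return $base['scheme'] . ':' . $url;  // Protocol-relative.
    }
    if (strpos($url, '/') === 0) {
        return $origin . $url;                // Root-relative.
    }
    // Path-relative: append to the directory part of the page path.
    $path = isset($base['path']) ? $base['path'] : '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);
    return $origin . $dir . $url;
}

echo resolveUrl('/css/site.css', 'http://example.com/news/list.html'), PHP_EOL;
// prints: http://example.com/css/site.css
echo resolveUrl('img/a.png', 'http://example.com/news/list.html'), PHP_EOL;
// prints: http://example.com/news/img/a.png
```
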
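Get the content between two given markers

    function getTagData($str, $start, $end) {
        if ($start == '' || $end == '') {
            return;
        }
        $str = explode($start, $str);
        // Return nothing when the start marker does not occur in the text.
        if (!isset($str[1])) {
            return;
        }
        $str = explode($end, $str[1]);
        return $str[0];
    }
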
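
`getTagData` returns only the first match. When a page repeats the same pair of markers (list items, for instance), every occurrence can be collected with one regular expression. This is a minimal sketch. The name `getAllTagData` is hypothetical and is not part of the original collection.

```php
<?php
// Hypothetical helper, not from the original post:
// return every substring found between $start and $end.
function getAllTagData(string $str, string $start, string $end): array
{
    if ($start === '' || $end === '') {
        return array();
    }
    // preg_quote() escapes the markers so they are matched literally;
    // (.*?) is lazy and the s flag lets it span line breaks.
    $pattern = '/' . preg_quote($start, '/') . '(.*?)' . preg_quote($end, '/') . '/s';
    preg_match_all($pattern, $str, $matches);
    return $matches[1];
}

print_r(getAllTagData('<li>a</li><li>b</li>', '<li>', '</li>'));
// The resulting array holds "a" and "b".
```
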
Convert each row of an HTML table into a CSV-style string (returns an array of rows)

    function getTrArray($table) {
        $table = preg_replace("'<td[^>]*?>'si", '"', $table);
        $table = str_replace("</td>", '",', $table);
        $table = str_replace("</tr>", "{tr}", $table);
        // Strip the remaining HTML tags
        $table = preg_replace("'<[\/\!]*?[^<>]*?>'si", "", $table);
        // Strip whitespace
        $table = preg_replace("'([\r\n])[\s]+'", "", $table);
        $table = str_replace(" ", "", $table);
        $table = str_replace("&nbsp;", "", $table);
        $table = explode(",{tr}", $table);
        array_pop($table);
        return $table;
    }

Convert every row and column of an HTML table into a nested array (for scraping table data)

    function getTdArray($table) {
        $table = preg_replace("'<table[^>]*?>'si", "", $table);
        $table = preg_replace("'<tr[^>]*?>'si", "", $table);
        $table = preg_replace("'<td[^>]*?>'si", "", $table);
        $table = str_replace("</tr>", "{tr}", $table);
        $table = str_replace("</td>", "{td}", $table);
        // Strip the remaining HTML tags
        $table = preg_replace("'<[\/\!]*?[^<>]*?>'si", "", $table);
        // Strip whitespace
        $table = preg_replace("'([\r\n])[\s]+'", "", $table);
        $table = str_replace(" ", "", $table);
        $table = str_replace("&nbsp;", "", $table);
        $table = explode('{tr}', $table);
        array_pop($table);
        $td_array = array();
        foreach ($table as $key => $tr) {
            $td = explode('{td}', $tr);
            array_pop($td);
            $td_array[] = $td;
        }
        return $td_array;
    }

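
Because `getTdArray` removes every space, a cell such as "Green Tea" comes back as "GreenTea". The sketch below shows one possible variation that keeps single spaces inside cells and also reads `<th>` header cells. The name `tableToRows` is hypothetical, and nested tables are assumed not to occur.

```php
<?php
// Hypothetical helper, not from the original post:
// turn a simple HTML table into an array of rows, each an array of cell texts.
function tableToRows(string $tableHtml): array
{
    $rows = array();
    // \b stops <tr from also matching other tag names that start with "tr".
    preg_match_all('#<tr\b[^>]*>(.*?)</tr>#is', $tableHtml, $trMatches);
    foreach ($trMatches[1] as $rowHtml) {
        preg_match_all('#<t[dh]\b[^>]*>(.*?)</t[dh]>#is', $rowHtml, $cellMatches);
        $cells = array();
        foreach ($cellMatches[1] as $cellHtml) {
            // Drop inline tags, decode entities, then collapse runs of whitespace.
            $text = html_entity_decode(strip_tags($cellHtml), ENT_QUOTES, 'UTF-8');
            $cells[] = trim(preg_replace('/\s+/', ' ', $text));
        }
        $rows[] = $cells;
    }
    return $rows;
}

$html = '<table><tr><th>Name</th><th>Price</th></tr>'
      . '<tr><td> <b>Green Tea</b> </td><td>3 &amp; up</td></tr></table>';
print_r(tableToRows($html));
// Row 0 holds "Name" and "Price"; row 1 holds "Green Tea" and "3 & up".
```
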
Return every English word in a string, sorted ($distinct = true removes duplicates)

    function splitEnStr($str, $distinct = true) {
        preg_match_all('/([a-zA-Z]+)/', $str, $match);
        if ($distinct == true) {
            $match[1] = array_unique($match[1]);
        }
        sort($match[1]);
        return $match[1];
    }