<?xml version="1.0" encoding='utf-8'?>
<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.1//EN" "http://www.wapforum.org/DTD/wml_1.1.xml">
<wml>
<card id="card1" title="Web crawler - Page 10 - Wikipedia">
<p>
<a accesskey="1" href="page.php?w=web_crawler&amp;p=9">1.Previous</a><br />
<a accesskey="3" href="page.php?w=web_crawler&amp;p=11">3.Next</a>
</p>
<p>* a re-visit policy that states when to check for changes to the pages<br/>
* a politeness policy that states how to avoid overloading <a href="page.php?w=websites">websites</a><br/>
* a parallelization policy that states how to coordinate distributed web crawlers</p>

<p><big>Selection policy</big></p>
<p>Given the current size of the Web, even large search engines cover only a portion of the publicly available part. A 2009 study showed that even large-scale <a href="page.php?w=search_engines">search engines</a> index no more than 40-70% of the indexable</p><p>
</p>

<do type="prev" label="Search">
        <go href="search.wml"/>
</do>

</card>
</wml>
