Tip #18. Why use a dedicated Web server for crawling

By default, Office SharePoint Server 2007 uses all of the front-end Web (WFE) servers in the server farm to crawl content in that farm. The index server sends requests to each WFE server in the farm. The WFE servers get the requested content from the SharePoint sites in the farm and forward it to the index server for indexing.

This works well for small-to-medium-size organizations. Large organizations, however, tend to crawl more content than small ones. This can mean hundreds of gigabytes or even terabytes of content being crawled and indexed, which places a heavy load on the front-end Web servers. That load degrades the performance of every front-end Web server in the farm: it raises CPU usage and can account for up to 50% of their traffic.

To improve overall performance in a large farm, configure a dedicated Web server for crawling content, especially if you are crawling a server farm that contains more than 500 gigabytes (GB) of content or if you are crawling content over the WAN. Consider removing this dedicated server from NLB routing so that it does not also serve user requests.
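
The rule of thumb above can be written down as a small check. This is only an illustrative sketch in Python, not a SharePoint API: the `Farm` class, its fields, and the function name are hypothetical, and the only value taken from the tip is the 500 GB threshold.

```python
from dataclasses import dataclass

# Threshold quoted in the tip; everything else here is a hypothetical helper.
CONTENT_THRESHOLD_GB = 500


@dataclass
class Farm:
    content_gb: float       # total size of the content being crawled
    crawls_over_wan: bool   # True if the crawler reaches content over the WAN


def should_use_dedicated_crawl_server(farm: Farm) -> bool:
    """Return True when the tip's guidance suggests a dedicated crawl server."""
    # "More than 500 GB" is read as a strict comparison.
    return farm.content_gb > CONTENT_THRESHOLD_GB or farm.crawls_over_wan
```

For example, `should_use_dedicated_crawl_server(Farm(content_gb=800, crawls_over_wan=False))` evaluates to `True`, while a 200 GB farm crawled over the LAN does not meet either condition.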

Update:

There are two scenarios for a dedicated Web server:

  1. Using the index server with the Web role
  2. Using an additional WFE server

When the same server acts as both the index server and the dedicated Web server, the index server no longer needs to send requests to a different server when crawling content. This reduces overall network traffic and improves crawl performance.

However, in some cases you should prefer an additional WFE server for crawling rather than the index server:

  • Another application (such as Excel Calculation Services) is running on the index server. Configuring a dedicated front-end Web server for crawling might prevent that application from communicating with other servers in the farm, so move such applications to another application server first.
  • You want to use the index server as the dedicated front-end Web server for crawling, and the index server is also configured as a query server.
  • The NetBIOS name of your query server is also the host name of your SharePoint site. (In some cases, the timer service writes an incorrect IP address to your Hosts file.)

Note that in either of the last two cases, configuring a dedicated front-end Web server for crawling can prevent the index server from propagating the index to another server.
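
The choice between the two scenarios can be summarized as a short checklist. The sketch below is a hypothetical planning aid in Python, not part of SharePoint: the `IndexServer` class, its field names, and the returned labels are all assumptions made for illustration, and the three conditions are simply the bullets listed above.

```python
from dataclasses import dataclass


@dataclass
class IndexServer:
    # Hypothetical flags mirroring the three cases listed in the tip.
    runs_other_applications: bool           # e.g. Excel Calculation Services
    is_query_server: bool                   # index server also has the query role
    query_netbios_matches_site_host: bool   # query server NetBIOS name == site host name


def choose_crawl_server(index: IndexServer) -> str:
    """Pick one of the two scenarios from the tip for a dedicated crawl server."""
    blockers = (
        index.runs_other_applications,
        index.is_query_server,
        index.query_netbios_matches_site_host,
    )
    # Any one of the listed cases is enough to prefer a separate WFE server.
    if any(blockers):
        return "additional WFE server"
    return "index server with Web role"
```

With no flags set, `choose_crawl_server(IndexServer(False, False, False))` returns `"index server with Web role"`; setting any single flag switches the result to `"additional WFE server"`.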

Source
