robots.txt and Crawling by Web Robots

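A site can use its robots.txt file to tell crawlers which paths they should not crawl.

Compliance is voluntary, though: a crawler can ignore these rules and crawl anyway...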

http://www.google.com/robots.txt

User-agent: *
Disallow: /search
Disallow: /groups
Disallow: /images
Disallow: /catalogs
Disallow: /catalogues
Disallow: /news
Allow: /news/directory
Disallow: /nwshp
Disallow: /setnewsprefs?
Disallow: /index.html?
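
The rules above can be checked with a few lines of Python. This is a minimal sketch, assuming the longest-match precedence described in RFC 9309 (the most specific matching rule wins, and Allow wins a tie). It handles plain path prefixes only, not the "*" or "$" wildcards, and the names RULES and is_allowed are made up for this example. It is written by hand rather than with a library so the precedence rule is visible.

```python
# Simplified sketch: plain prefix matching only, no "*" or "$" wildcards.
# RULES mirrors the "User-agent: *" group quoted above.
RULES = [
    ("disallow", "/search"),
    ("disallow", "/groups"),
    ("disallow", "/images"),
    ("disallow", "/catalogs"),
    ("disallow", "/catalogues"),
    ("disallow", "/news"),
    ("allow", "/news/directory"),
    ("disallow", "/nwshp"),
    ("disallow", "/setnewsprefs?"),
    ("disallow", "/index.html?"),
]


def is_allowed(path, rules=RULES):
    """Return True if `path` (path plus optional query string) may be crawled.

    Assumption: the longest matching prefix decides, and an Allow rule
    wins when an Allow and a Disallow prefix have the same length.
    """
    best_len = -1
    allowed = True  # a path that matches no rule is allowed
    for kind, prefix in rules:
        if not path.startswith(prefix):
            continue
        longer = len(prefix) > best_len
        tie_for_allow = len(prefix) == best_len and kind == "allow"
        if longer or tie_for_allow:
            best_len = len(prefix)
            allowed = kind == "allow"
    return allowed


if __name__ == "__main__":
    for p in ["/search?q=robots", "/news/today", "/news/directory", "/maps"]:
        print(p, is_allowed(p))
```

Under these assumptions, /news/today is blocked by "Disallow: /news", while /news/directory stays crawlable because the longer Allow rule is more specific.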

Naver, by contrast, asks crawlers not to crawl anything at all on this host:

http://me.naver.com/robots.txt

user-agent:*
disallow:/
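
A well-behaved crawler checks such a file before fetching a page. The sketch below feeds the two lines above to urllib.robotparser from the Python standard library. The user-agent name "ExampleBot" and the page URLs are assumptions made for the example, and the rules are parsed from a string so nothing is fetched over the network.

```python
from urllib.robotparser import RobotFileParser

# The two rule lines quoted above, passed in directly instead of downloaded.
ROBOTS_TXT = """\
user-agent:*
disallow:/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_fetch(url, agent="ExampleBot"):
    """Return True if `agent` (a hypothetical crawler name) may fetch `url`."""
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # "disallow:/" matches every path, so both checks report False.
    print(may_fetch("http://me.naver.com/"))
    print(may_fetch("http://me.naver.com/some/page"))
```

In a real crawler, set_url() followed by read() would download the live file instead. Whether the crawler honors the answer is still up to the crawler.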

License: Attribution, NonCommercial, ShareAlike
