Site details
Name | Magpie |
Short name | MAG |
URL | http://magpie-data.magpie.net |
Update frequency (how often indexing and category extraction should be performed) | 00:05:00 |
Process As RSS? | false |
Index Raw Content? | false |
Content Expiry | Never expires |
eg. www.example.com/document/textonly
No URL filters - all parts of this site will be indexed.
|
No Domain Filters. Only pages from the same domain as the starting URL will be included.
|
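The default same-domain rule above can be illustrated with a minimal sketch. This is not Searchbox's implementation. It assumes "same domain" means an exact, case-insensitive host match against the starting URL; the function name same_domain is hypothetical, and the real product may treat subdomains or ports differently.

```python
from urllib.parse import urlsplit


def same_domain(start_url: str, candidate_url: str) -> bool:
    """Return True when both URLs have the same host.

    Assumption: an exact host comparison. urlsplit().hostname is already
    lower-cased, so the comparison is case-insensitive.
    """
    start_host = urlsplit(start_url).hostname
    candidate_host = urlsplit(candidate_url).hostname
    return start_host is not None and start_host == candidate_host


# Site URL taken from the settings above.
START_URL = "http://magpie-data.magpie.net"

if __name__ == "__main__":
    print(same_domain(START_URL, "http://magpie-data.magpie.net/news"))
    print(same_domain(START_URL, "http://www.example.com/news"))
```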
Choose which methods to use to extract content from a page
No content extractors - the whole page will be used as content.
|
These are values passed to the server through a page's address.
Searchbox will ignore pages distinguished by URL parameters unless they are specified here.
eg. www.example.com?page=12
There are no URL parameters |
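One way to picture the URL parameter rule above is as URL normalisation: query parameters that are not listed are dropped, so pages that differ only by those parameters collapse to a single address. The sketch below works under that assumption and is not Searchbox's actual code; canonical_url and significant_params are hypothetical names.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonical_url(url: str, significant_params: frozenset = frozenset()) -> str:
    """Drop every query parameter that is not listed as significant.

    With an empty set (the "There are no URL parameters" case), all
    query parameters are removed. The fragment is also discarded.
    """
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key in significant_params
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


if __name__ == "__main__":
    # No parameters configured: ?page=12 does not distinguish the page.
    print(canonical_url("http://www.example.com?page=12"))
    # "page" configured: the parameter is kept, so each page is distinct.
    print(canonical_url("http://www.example.com?page=12", frozenset({"page"})))
```

Under this reading, adding a parameter such as page to the list is what makes www.example.com?page=12 and www.example.com?page=13 count as separate pages.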
Searchbox will start spidering the site at these points eg: www.example.com/news, www.example.com/business or even www.example.com/
No starting points specified.
|
Site-category extraction methods
The site's taxonomy can be derived from its URLs or its metatags
Not looking for site categories
|
Categories are not being extracted from this site. To choose a category extraction technique click the edit button above. |
Site category to taxonomy mappings
No Subscription model is currently defined
|
Allows you to extract the title from certain pages
eg. <meta name="keywords">.*</meta>
No Title filters - default title extraction will be applied.
|
Specify whether a given page is to be ignored, eg. a 404 page that does not actually return the 404 code
eg. textonly
No Ignorable Page filters - all parts of this site will be indexed.
|
eg. textonly
No Address filters - all parts of the link URL will be used.
|
Change Filters allow you to specify URLs that do not trigger alert hits
No Change filters - no URLs will be prevented from triggering alerts.
|