If you've set up a scan in AppCheck, you'll be familiar with the Targets box: it's usually the first thing you fill in.
What you may not be aware of is another place to specify targets: Seeded Targets (this can be found under Web Application Scanner Settings -> Advanced Settings).
This article explains how the different types of target define the behaviour of your web application scan.
- How Targets Define a Scan
- Seeded Targets
- Application Root
- What Should You Do If You Want To Specify Several Paths Within One Application?
- What Should You Do If You Do Not Want To Scan Everything Under / ?
- What Should You Do If You Want To Explicitly Exclude A Specific URL From A Scan?
How Targets Define a Scan
When a scan is launched, the first stage is a "crawl" of each application specified as a target.
Starting with the application's root (which is usually /), the scanner looks for hyperlinks in the returned HTML content and for URLs/paths mentioned in source code (including in scripts, frames etc).
The scanner then follows these links and repeats the process recursively until the crawler has a complete map of the application. It will also make requests to additional paths that commonly exist even if it doesn't see links to them, such as /admin (or /wp-admin, to look for WordPress sites).
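The crawl described above can be pictured as a simple breadth-first loop. The sketch below is only an illustration of the idea, not AppCheck's implementation; the function names, the seeds parameter and the common-paths list are assumptions made for the example:

```python
from urllib.parse import urljoin, urlparse

# Illustrative list of paths requested even when no link to them is
# found; the scanner's real list is much larger.
COMMON_PATHS = ["/admin", "/wp-admin"]

def crawl(root, fetch_links, seeds=()):
    """Breadth-first crawl of one application, starting at `root`.

    `fetch_links(url)` stands in for requesting a page and extracting
    hyperlinks and script/frame URLs from it; `seeds` are extra URLs
    added to the map before crawling starts (this mirrors Seeded
    Targets).
    """
    queue = [root] + [urljoin(root, p) for p in COMMON_PATHS] + list(seeds)
    mapped = set()  # becomes the "Mapped Attack Surface"
    while queue:
        url = queue.pop(0)
        if url in mapped:
            continue
        mapped.add(url)
        for link in fetch_links(url):
            absolute = urljoin(url, link)
            # Only follow links within the same application.
            if urlparse(absolute).netloc == urlparse(root).netloc:
                queue.append(absolute)
    return mapped
```

In this sketch, the final `mapped` set plays the role of the Mapped Attack Surface that is handed on to the active-scanning stage.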
The resulting map of the application (known as the Mapped Attack Surface) is then passed on to the next stage of the process where active scanning takes place.
Note: For more information on the Mapped Attack Surface, see How can I see a list of which paths or URLs AppCheck has scanned (crawled and attacked) for my web application?
Seeded Targets

Seeded targets are URLs that are explicitly added to the map (the scanner's list of potential attack points within an application) before crawling, to ensure that the given paths and query strings (and anything else found by continuing to crawl from them) are included in the scan. Often they would be found anyway during the crawl, but adding them as Seeded Targets makes sure they are not missed for any reason.
This is generally only needed when a given URL can't be found by crawling from the application root (ie there's no link to it from the rest of the application), or when a URL may be incorrectly removed from the map during de-duplication.
Application Root

The crucial point to be aware of is that the root of an application is assumed to be /, even if a path is specified in the URL in the Targets box.
For example, if you add the following URL to your Targets box:

https://example.com/login

then the scan target is treated as https://example.com/, while /login is simply treated as a seeded target (ie it is added to the map of the application, and we crawl from there too). This is because people often enter the URL of the login page as the target when what they actually want to scan is the whole application, not just the login page.
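The rule above can be sketched in a few lines (an illustration only; interpret_target is a hypothetical helper for this example, not part of AppCheck):

```python
from urllib.parse import urlparse

def interpret_target(url):
    """Apply the rule described above: the scan target root is assumed
    to be /, and any path on the URL becomes a seeded target."""
    parts = urlparse(url)
    scan_target = f"{parts.scheme}://{parts.netloc}/"
    seeded_target = url if parts.path not in ("", "/") else None
    return scan_target, seeded_target

# https://example.com/login is treated as scan target
# https://example.com/ plus seeded target https://example.com/login.
print(interpret_target("https://example.com/login"))
```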
This means if you add two URLs:

https://example.com/
https://example.com/login

then what you actually have is two identical scan targets:

https://example.com/
https://example.com/

and one seeded target:

https://example.com/login

meaning you scan the whole application (https://example.com/) twice.
What Should You Do If You Want To Specify Several Paths Within One Application?
Add the root of the application to the Targets box and add the extra URLs to the Seeded Targets box.
For example, add

https://example.com/

as the target, and add

https://example.com/login
https://example.com/my-application

as seeded targets. This means we will scan all of https://example.com/ and be sure to include /login and /my-application.
What Should You Do If You Do Not Want To Scan Everything Under / ?
If you want to specify a root other than / then you can do so using a pipe character (|) after the URL in the Targets box, eg:

https://example.com/my-application/|

In this case the scan target is treated as

https://example.com/my-application/

with /my-application/ as the root, and nothing outside that path, such as https://example.com/ or https://example.com/login, will be scanned.
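The pipe syntax and its effect on scan scope can be illustrated with a small sketch (scan_root and in_scope are hypothetical helper names for this example, not AppCheck's code):

```python
from urllib.parse import urlparse

def scan_root(entry):
    """Work out the crawl root for a Targets-box entry: a trailing
    pipe pins the root to the URL's own path; otherwise the root
    defaults to /."""
    if entry.endswith("|"):
        return entry[:-1]
    parts = urlparse(entry)
    return f"{parts.scheme}://{parts.netloc}/"

def in_scope(url, root):
    """Only URLs under the root are crawled and attacked."""
    return url.startswith(root)

root = scan_root("https://example.com/my-application/|")
# /my-application/ pages are in scope; /login is not.
print(in_scope("https://example.com/my-application/page", root))
print(in_scope("https://example.com/login", root))
```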
What Should You Do If You Want To Explicitly Exclude A Specific URL From A Scan?
Add the URL you wish to exclude to the scan's Blacklist Targets, which can be found just below the Targets box near the top of the scan configuration page.
Any URL which begins with a blacklisted URL will be excluded from the scan. For example, if the blacklist contains https://example.com/do-not-scan, then https://example.com/do-not-scan/secret-child-page will also be excluded, and so on.
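This prefix-matching behaviour can be illustrated as follows (is_blacklisted is a hypothetical helper for the example, not AppCheck's code):

```python
def is_blacklisted(url, blacklist):
    """A URL is excluded if it begins with any blacklisted URL, so
    child pages of a blacklisted path are excluded as well."""
    return any(url.startswith(entry) for entry in blacklist)

blacklist = ["https://example.com/do-not-scan"]
# The blacklisted path and everything under it are excluded.
print(is_blacklisted("https://example.com/do-not-scan/secret-child-page", blacklist))
print(is_blacklisted("https://example.com/login", blacklist))
```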