Known Bots
Dashboard showing known, verified, and unverified self-declared bot traffic
What is this dashboard for…
This dashboard provides an overview of all traffic from both verified and unverified self-declared bots. These are bots that declare who they are via their User Agent string, rather than using a generic Chrome or similar browser UA.
Verified bots are popular, known-good bots that Netacea has checked to confirm they are who they claim to be. Unverified bots are those where these checks have not (yet) been made, so their UAs could be spoofed, or they could be legitimate. The constant proliferation of AI, LLM and similar tools means that this list is constantly growing and evolving, so the verification process is similarly iterative.
This will help you to easily identify…
Traffic levels seen from known bots, and how many (if any) are being mitigated
Information about the stated purpose of these bots - their UAs, the vendor they represent, the family of tools they belong to, and so on
If any of the known bots are making too many requests to your site
Effects of applying new tooling or marketing changes on your known bots traffic levels
Whether a known bot is on your robots.txt disallow list, with the ability to infer whether or not it is obeying those instructions
If a known bot is present on your site that is NOT a tool used by your team. This could either be an error by the provider, or an indicator of an attacker deploying a recognisable tool to disguise their presence on your site
Dimensions you can filter on…
Date Ranges
Bot Vendor
Bot Name
User Agent
All known bots, verified only, unverified only
Controls
Controls are a great way of filtering the report to highlight and segment only the information you need to see.
These can be managed by:
Using the Date Range Picker to filter by time
Using the Datastream selection controls to filter by the domain or endpoint you'd like to investigate
Clicking on the report itself to select and segment, either:
within the table of Bot Vendors, or
in the User Agent breakdown
Once a filter has been applied, the dashboard and tables will update automatically to reflect this.
Obeying robots.txt
The data in the Known Bots breakdown table is automatically sorted by volume. This means that unless your robots.txt instructions are extremely comprehensive and include some of the most 'active' vendors (Google, Bing and so on), the data relevant to your robots.txt list is unlikely to appear at the top of the table. Fortunately, surfacing this data is very straightforward.
In the table, click on the 'On robots.txt' column heading, and change the sort order from 'Requests' to 'On robots.txt', as per the image below.
You will then see the UAs matching your robots.txt instructions at the top of the table, allowing you to see how many requests these bots are making and - importantly - how many paths they are hitting. If the number of paths is greater than one, you can infer that they are disobeying the instructions in your listing.
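As a point of reference, a robots.txt entry that disallows a named bot pairs a User-agent line with one or more Disallow rules. A minimal sketch (GPTBot is used here purely as an illustrative example of a self-declared crawler token; substitute the UA tokens relevant to your own list):

User-agent: GPTBot
Disallow: /

Note that the path-count inference above is strongest for sitewide rules like "Disallow: /"; if your listing only disallows specific paths (for example "Disallow: /private/"), a compliant bot may still legitimately request many other paths.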
Report Exports
To export any of the data in the dashboard, hover over one of the tables or plots, click on the ellipsis in the top right-hand corner, and select Export to CSV or Excel.