transmissionbt.com (a BitTorrent client for macOS) is outranking youtube, wikipedia, github, etc. Is Transmission that popular? I assume it's the auto-updater? Seems insane.
My guess is that DNS caching in web browsers prevents repeated lookup requests, whereas maybe the Transmission implementation has no caching and does a lookup every time.
As the comments here suggest, this list may be more indicative of some developer-introduced application behaviour, e.g., gratuitous DNS lookups, than "popularity".
In addition to Tranco, I maintain regularly updated lists of the top one million domains from sources like Cisco, Majestic, BuiltWith, Statvoo, DomCop, and Cloudflare. Feel free to check it out: https://github.com/PeterDaveHello/top-1m-domains
Or they query the DNS very often. Most devices have DNS caching, so if things like tiktok.com end up there, there must be a lot of devices (also, a lot of subdomains, which aren't visible in these lists).
Are there host lists for pihole/adguard/ublock for these kinds of domains?
I'd assume the domains change regularly if it's malware or bot networks, but because they rank so high in this list, it sounds like it should be feasible to keep a blocklist somewhat up to date.
It could also be ad networks creating random domains and subdomains so that simple domain blocklists are difficult to keep up to date efficiently (or at least, so that constant maintenance is required).
It could be a good pattern for spam/ads organizations: change the random domain name as soon as traffic drops because the current one has ended up on enough blocklists.
It's quite interesting to me that ChatGPT is in the 200s and 300s.
By almost every metric this is one of the 10 busiest websites, and some sources are already putting it in the top 5.
Are they just disproportionately not using Quad9?
I understand that there's a lot of overlap with Google having several spots in the top 50 itself, several being infrastructure like cloudflare and akamai, and several others being malware - but it still seems surprising.
It's just kind of shocking to see Slack, Zoom, LinkedIn, and even DropBox, Roku, and Yandex much higher up.
Something else to factor in is the TTL of both the NS and A records for each apex domain and the individual records, including sub-domains. Clients will not query Quad9 again until the TTL expires in their local caches. TTL would have to be factored into query rates to determine popularity correctly, whereas these lists just show raw query counts.
For example, there are many records under amazonaws.com with 5-second TTLs, mostly EC2 instances. As such, clients will query them at a much higher rate, whereas grammarly.io has a number of records with a 900-second TTL. This will skew the ranking positions of the two apex domains. I suppose if one wanted to game this, they could have an A record for a non-critical part of a site that is not visibly rendered by the end user and give it a TTL of 1 second, assuming Quad9 is not rewriting min/max-ttl, which some resolvers do.
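To make the skew concrete, a rough back-of-the-envelope sketch (assuming a single client that re-resolves the moment its cached answer expires; the 5 s and 900 s TTLs are the ones mentioned above):

# Worst-case queries per day from one client that re-resolves as soon as its cache expires.
SECONDS_PER_DAY = 86_400
for name, ttl in [("amazonaws.com EC2 record", 5), ("grammarly.io record", 900)]:
    print(f"{name}: TTL {ttl}s -> up to {SECONDS_PER_DAY // ttl} queries/day per client")
# amazonaws.com EC2 record: TTL 5s -> up to 17280 queries/day per client
# grammarly.io record: TTL 900s -> up to 96 queries/day per client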
Examples of just some of the TTLs used on these apex domains, excluding individual records:
Some examples of rewriting max-ttl (I forget which ones rewrite min-ttl):
for Resolver in 1.1.1.1 8.8.8.8 9.9.9.9 216.128.176.142;do echo -en "${Resolver}:\t"; dig @${Resolver} +nocookie +noall +answer -t a big.ohcdn.net;done | column -t
1.1.1.1: big.ohcdn.net. 3628800 IN A 227.227.227.227
8.8.8.8: big.ohcdn.net. 21422 IN A 227.227.227.227
9.9.9.9: big.ohcdn.net. 43200 IN A 227.227.227.227
216.128.176.142: big.ohcdn.net. 3628800 IN A 227.227.227.227 # authoritative server
[Edit] I just realized they made a general statement to this effect in the git repo.
My theory: the domains you name have ad beacons, desktop apps that are persistently running, and/or physical devices plugged into networks out there. Whereas ChatGPT is used (domainwise) overwhelmingly by humans hitting the site in their browsers.
I was personally going to be surprised. Bots and machines categorically do not peruse such material, and DNS traffic is largely not going to have a human on the other end.
I’m not entirely sure what it is, but my Alexa devices hit subdomains within it very frequently based on my local DNS history. That’s probably why it made the top of the list.
I don't see how it would be possible to produce this table under Quad9's privacy policy. Nothing in their privacy policy says that they maintain logs that would enable them to count queries by label. Can anyone explain?
It does say that they collect this information in their “Data and Privacy Policy”. Specifically section 2.2 (Data Collected): https://quad9.net/privacy/policy/
Which policy are you referring to that implies they don’t?
Also I think you are assuming they store query logs and then aggregate this data later. It is much simpler just to maintain an integer counter for monitoring as the queries come in, and ingest that into a time-series database (not sure if that's what they actually do). Maybe it needs to be a bit fancier to handle the cardinality of the DNS-name dimension, but reconstructing this from logs would be much more expensive.
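For illustration, a minimal sketch of the count-as-you-go approach (nothing here is taken from Quad9's actual pipeline; the apex extraction is a naive stand-in for a Public Suffix List lookup):

from collections import Counter

label_counts = Counter()  # aggregate per-name counters only, no per-client logs

def on_query(qname: str) -> None:
    # Naively reduce cardinality to the registrable domain (last two labels);
    # real code would consult the Public Suffix List.
    apex = ".".join(qname.rstrip(".").split(".")[-2:])
    label_counts[apex] += 1

for q in ["px.ads.linkedin.com.", "www.linkedin.com.", "big.ohcdn.net."]:
    on_query(q)

print(label_counts.most_common())  # periodically flush these counts to a time-series DB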
The section you mentioned does not say anything about having counters for labels. It only mentions that they record "[t]he times of the first and most recent instances of queries for each query label".
Well, the counters aren't data collected, they are data derived from the data they do collect. The privacy policy covers collection.
EDIT: I see they went out of their way to say "this is the complete list of everything we count" and they did not include counters by label, so I see your point!
I don't see how that is compatible with 2.2. They don't say anything about counters per label. It mentions counters per RR type, and watermarks of the first and most recent query timestamps per label, not a count per label.
If an organization is going to be this specific about what they count, it implies that this is everything they count, not that there may also be other junk unmentioned.
I took a look at their privacy policy and agree that it doesn't specifically list that it logs which domains are being queried. It does list a bunch of things it does log as counters, all of which seems reasonable, but they don't explicitly say "we count which domains are being queried".
That said, I think it's entirely reasonable for them to log domains alone if they're completely disconnected from any user activity, i.e. a simple "increment the counter for foo.com" is reasonable since that's unrelated to user privacy.
Unless, say, an adversary can link an obscure domain to a specific user or use case. Get that counter log and you can track a certain behavior (the domain only gets pinged when they are about to do something, or when they are on vacation and their house is empty, etc.).
One way around that is to set up a cron job that hourly queries the most common domains one visits. When workstations and cell phones request them, they will be served from cache. At least, that is what I have been doing for a few decades and it works fine. I block all the DoH/DoT resolvers, which is easier to do than some might think. One can do the individual A records, or just the apex A/NS records to warm the infrastructure cache, and then configure Unbound to prefetch records that are about to expire.
Just for fun I have added some of these into my cron job.
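For what it's worth, a minimal sketch of that kind of cache-warming job (assuming it runs on, or points at, the local caching resolver such as Unbound; the domain list is a placeholder):

#!/usr/bin/env python3
# Hypothetical cache warmer, run hourly from cron, e.g.:
#   0 * * * * /usr/local/bin/warm-dns-cache.py
# Resolving through the system resolver (assumed to be the local Unbound)
# keeps those answers cached for the workstations and phones behind it.
import socket

DOMAINS = ["example.com", "en.wikipedia.org", "github.com"]  # placeholder list

for name in DOMAINS:
    try:
        socket.getaddrinfo(name, None)  # triggers A/AAAA lookups via the configured resolver
    except socket.gaierror:
        pass  # skip names that fail to resolve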
Huh? The average Quad9 user is probably a tech-savvy person who cares about online privacy and/or malware protection (Quad9 blocks known malicious domains).
Isn't part of the reason to run a public DNS resolver to sell this hard-earned info for profit to marketers etc.? Yet here they just release it publicly. Of course this is just the tip of the iceberg of the information they gather.
Really interesting to know though.
Some just look way too high up, which could mean a buggy implementation without proper cache usage, or something persistently banging the domain.
Probably some sort of command and control for a botnet.
They calculate a random domain name based on the timestamp (so it’s constantly changing every X days in case it gets seized), and have some validation to make sure commands are signed (to prevent someone name squatting to control their botnet).
Wow, that's smart. I was wondering whether there is a way for the bots to generate "unpredictable" domains such that security researchers could not predict them efficiently (even with source code), but the botnet controller can.
Time-lock puzzles come close, but they require that the bots have computing power comparable to the security researchers'.
> Wow, that's smart. I was wondering whether there is a way for the bots to generate "unpredictable" domains such that security researchers could not predict them efficiently (even with source code), but the botnet controller can.
There is a fairly simple method which achieves the same advantage for a botnet controller.
1. Use a hash of the current day to derive, for that day, an infinite stream of domain names. This could be something as simple as `to_human_readable_domain(sha256(daily_hash + i))`.
2. A botnet slave attempts to access servers in a diagonal order over (days, domains), starting at the first domain for today and working backwards in days and forwards in domains. An image best describes what I mean by this: https://i.imgur.com/lcEbHwz.png
3. So long as one of those domains is controlled by the botnet operator (which can be verified using a signed response from the server), they can control the botnet.
This means that the botnet operator only needs to purchase one domain every couple of days to keep controlling their botnet, while someone trying to stop them will have to buy thousands and thousands every day.
And when you successfully purchase a domain you can publish the new domain to any connected slaves, so this scheme is only necessary for recruitment into the network, not continued control.
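A minimal sketch of step 1, assuming SHA-256 and fixed 9-letter .com labels (the constants and the derivation are illustrative, not taken from any real botnet):

import hashlib
from datetime import date

ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def daily_domains(day: date, count: int = 20, length: int = 9) -> list[str]:
    # Seed for the day; the operator and every bot can compute it from the date alone.
    daily_hash = hashlib.sha256(day.isoformat().encode()).digest()
    names = []
    for i in range(count):
        digest = hashlib.sha256(daily_hash + i.to_bytes(4, "big")).digest()
        names.append("".join(ALPHABET[b % 26] for b in digest[:length]) + ".com")
    return names

# A bot would walk these candidates (today's list first, then earlier days)
# until one of them answers with a correctly signed response.
print(daily_domains(date.today())[:3])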
Imgur has been inaccessible for me for months, they're one of those organizations that consider it proper to block whole countries to counter bot abuse.
I've definitely heard of C&C using multiple domains for this reason. The bots have a list of domains they reach out to, searching for one that is valid.
I believe one issue with this strategy is many corporate VPNs block fresh domains. I guess if the software was pinned to use encrypted DNS instead of whatever the OS recommends, then the DNS blocking could be avoided...
My employer uses Zscaler. I don't know exactly how they implement this, but my educated guess is the corporate DNS server doesn't resolve domains that were created recently.
In technical terms, the device asks the private corporate DNS server for the IP address of the hostname. The private DNS server checks the requested domain against a threat intelligence feed that tracks domain registration dates (and security risks). If the domain is deemed a threat, it either returns an IP address pointing at a server that shows a warning message (for HTTP traffic) or returns an invalid IP (0.0.0.0).
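A minimal sketch of that kind of policy, assuming the resolver can ask a threat-intel/WHOIS source for a registration date (the feed, the block-page address, and the 30-day threshold are all placeholders):

from datetime import date, timedelta
from typing import Optional

WARNING_PAGE_IP = "203.0.113.10"  # placeholder block-page address for HTTP traffic
SINKHOLE_IP = "0.0.0.0"
MIN_AGE = timedelta(days=30)      # placeholder "newly registered" threshold

def registration_date(domain: str) -> Optional[date]:
    # Stand-in for a threat-intelligence feed / WHOIS lookup.
    feed = {"example.com": date(1995, 8, 14)}
    return feed.get(domain)

def answer(domain: str, real_ip: str, is_http: bool) -> str:
    created = registration_date(domain)
    if created is None or date.today() - created < MIN_AGE:
        return WARNING_PAGE_IP if is_http else SINKHOLE_IP
    return real_ip

print(answer("example.com", "93.184.216.34", is_http=True))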
There are tools that are pretty good at detecting DGAs these days, but they are not often deployed.
The best thing to do, afaik, is to use services normal users have access to and communicate via those. It's hard to tell who is extracting the data from the third party, so the server stays hidden (e.g. the bot posts images to Twitter and the server scrapes the images from Twitter; this is already old news, but it's easier and more likely to sail through that next-gen firewall -_-).
I'd say having your 'own' servers and domains is maybe even a bit dated (though sadly still very effective!).
It's one of many possible strategies. Any one strategy can be blocked if it's used by enough malicious actors (e.g. Twitter can be forced to block base64 tweets); if they all use different strategies, it becomes harder to justify blocking each individual one.
If I’m remembering correctly, Conficker was the first major use of this technique. They used a relatively small domain pool (250) so the registries were able to lock them up preemptively.
I remember a couple legitimate sites getting slammed by accidental DDOS because the algorithm happened to generate their domain, but having a hard time finding a reference to that.
That might work for the current generation of bots, but it will become infeasible when the domain names are generated in such a way that they overlap with spellable and existing domain names.
Most likely something like an ad service to prevent their content being caught by domain blocklists. That would be similar to how a lot of websites started using randomized strings for attributes like id and class so that users couldn't block page elements based on CSS selectors.
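For illustration, the randomized-attribute trick looks roughly like this (hypothetical; real sites typically bake it into their build tooling rather than doing it per request):

import secrets

def random_class(prefix: str = "c") -> str:
    # A fresh name per build breaks any blocklist rule written against the old selector.
    return prefix + secrets.token_hex(4)

print(f'<div class="{random_class()}">ad slot</div>')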
https://github.com/Quad9DNS/quad9-domains-top500/blob/dfd513...
https://raw.githubusercontent.com/Quad9DNS/quad9-domains-top...
There's a bunch of random-looking domain names: cmidphnvq.com, rpqihexdb.com, facebook.com. I'd guess they're for advertising?
Some of these lists are already in uBO out of the box.
so does router.blockdh100c.co
Some of those have many trackers and background sub domains that add up.
For example, LinkedIn's most popular sub domain is px.ads.linkedin.com.
Here is a more comprehensive list with top 10k domains (including sub domains):
https://dnsarchive.net/top-domains?rank=top10k
{"position": 127, "domain_name": "amazon.dev", "date": "2025-07-10"}
Source: https://github.com/Quad9DNS/quad9-domains-top500/blob/main/t...
Looks like their customer support rep portal. Presumably there are not A/CNAME records at the top level, but na.headphones.whs.amazon.dev resolves.
54.in-addr.arpa looks to be Amazon's range and there are several others.
Edit: I've found that sometimes they're pretty poor at caching responses so you end up with a lot of these requests.
{"position": 5, "domain_name": "kxulsrwcq.com", "date": "2025-07-10"}
What the
https://www.ipaddress.com/website/kxulsrwcq.com/
> Safety/Trust: Unknown
https://files.catbox.moe/gilmd1.png
https://knowledgebase.paloaltonetworks.com/KCSArticleDetail?...
When getting a query for a domain you have not heard about, query whois for it. Store its registration date in the cache.
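A minimal sketch of that approach, shelling out to the whois CLI and caching the answer (the "Creation Date" field name varies by registry, so the parsing here is only a rough guess):

import functools
import re
import subprocess
from typing import Optional

@functools.lru_cache(maxsize=65536)
def creation_date(domain: str) -> Optional[str]:
    # The first query for a never-seen domain pays the whois cost; later ones hit the cache.
    out = subprocess.run(["whois", domain], capture_output=True, text=True).stdout
    match = re.search(r"Creation Date:\s*(\S+)", out, re.IGNORECASE)
    return match.group(1) if match else None

print(creation_date("example.com"))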
https://en.m.wikipedia.org/wiki/Conficker
https://quad9.net/service/threat-blocking/
Each time you resolve, the resulting IP can be part of the hash for predicting a future hostname.
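A minimal sketch of that idea (illustrative only): the next label depends on the IP the previous name actually resolved to, so the chain cannot be precomputed without observing, or controlling, those answers.

import hashlib

ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def next_label(seed: bytes, resolved_ip: str, length: int = 9) -> tuple[str, bytes]:
    # Mix the observed IP into the seed for the next hostname.
    new_seed = hashlib.sha256(seed + resolved_ip.encode()).digest()
    label = "".join(ALPHABET[b % 26] for b in new_seed[:length])
    return label, new_seed

seed = b"initial-seed"                          # placeholder
label, seed = next_label(seed, "203.0.113.7")   # IP returned by the previous resolution
print(label + ".com")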
{"position": 26, "domain_name": "cmidphnvq.com", "date": "2025-07-10"}
{"position": 28, "domain_name": "xmqkychtb.com", "date": "2025-07-10"}
{"position": 37, "domain_name": "ezdrtpvsa.com", "date": "2025-07-10"}
{"position": 38, "domain_name": "wvdbozpfc.com", "date": "2025-07-10"}
{"position": 46, "domain_name": "bldrdoc.gov", "date": "2025-07-10"}
{"position": 52, "domain_name": "gadf99632rm.xyz", "date": "2025-07-10"}
Geniuses...
I added it in the first place as it was a non-resolving .gov in the top 50 list which seemed out of place to me.
> bldrdoc.gov: No address associated with hostname
I see that the time related subdomains in your link do resolve to the nist.gov timeserver.
But I really am wondering what's up with all of the rest of these domains.
More googling gave me https://www.boulder.doc.gov
> Boulder is the home of scientific laboratories for the U. S. Department of Commerce’s NOAA, NIST and NTIA. Clustered on the foothills of the Rocky Mountains in Boulder Colorado, these labs are the home of scientific research and engineering in the fields of electromagnetics, materials reliability, optoelectronics, quantum electronics and physics, time and frequency, earth systems, weather and telecommunications.
Looks like a place full of scientific knowledge. I hope they haven't suffered much DOGEing.
Ex:
https://dnsarchive.net/search?q=cmidphnvq.com
https://dnsarchive.net/search?q=xmqkychtb
https://dnsarchive.net/ipv4/34.126.227.30
https://radar.cloudflare.com/domains/domain/kxulsrwcq.com