
Researchers at Virginia Tech have developed a new tool, ASINT, designed to improve how Autonomous System Numbers (ASNs) are mapped to their operating organizations.
ASINT combines registry and routing metadata with public web sources (whois, PeeringDB, corporate websites, merger and acquisition records, and more) and uses analysis pipelines to group ASNs under their parent organizations. Conceptually, it is similar to CAIDA’s AS2Org dataset, but with broader coverage and a stronger focus on operational use cases.
How ASINT Works
Briefly, ASINT follows this pipeline:
- Data collection: We ingest whois data from all five Regional Internet Registries (RIRs), CAIDA AS2Org, and PeeringDB, then augment this data with targeted web crawls of official company sites, Wikipedia, and news to capture aliases, rebrands, and parent-subsidiary ties that registries miss.
- Org-centric knowledge base: We normalize names, run named entity recognition to extract candidate organization mentions, and keep only text chunks that co-mention the target org and a candidate. These chunks go into a vector database to support fast, high-precision retrieval.
- Retrieval-augmented inference: For each org pair, we retrieve only the relevant snippets and ask a large language model (LLM) to classify the relationship as one of: alias, parent-child, or no relation. Constraining the model to the retrieved evidence both lowers cost and reduces hallucinations.
- Post-filtering and clustering: We merge records that share aliases, detect rebrands through majority-voted secondary aliases, and then build a directed acyclic graph of parent-child edges. The result is an ‘organization family’ that unifies ASNs owned or operated by the same entity, including subsidiaries.
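The co-mention filter in the knowledge-base step can be illustrated with a minimal Python sketch. This is not ASINT's implementation: the function names, the list of legal suffixes, and the organization names in the usage note are all hypothetical, and plain whole-word string matching stands in for the named entity recognition the pipeline actually uses.

```python
import re

# Illustrative (assumed) list of legal suffixes to drop during normalization.
LEGAL_SUFFIXES = {"inc", "llc", "ltd", "corp", "corporation", "co", "gmbh"}


def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, and drop legal suffixes."""
    cleaned = re.sub(r"[^\w\s]", " ", text.lower())
    tokens = [t for t in cleaned.split() if t not in LEGAL_SUFFIXES]
    return " ".join(tokens)


def contains_name(normalized_text: str, normalized_name: str) -> bool:
    """Whole-word containment check on already-normalized strings."""
    return f" {normalized_name} " in f" {normalized_text} "


def co_mention_chunks(chunks, target, candidates):
    """Keep only chunks that mention the target org AND at least one candidate.

    Returns a list of (chunk, matched_candidates) tuples.
    """
    target_n = normalize(target)
    candidates_n = {c: normalize(c) for c in candidates}
    kept = []
    for chunk in chunks:
        text = normalize(chunk)
        if not contains_name(text, target_n):
            continue
        hits = [c for c, n in candidates_n.items() if contains_name(text, n)]
        if hits:
            kept.append((chunk, hits))
    return kept
```

With a target of "Acme Networks Inc." and candidates "Globex Ltd" and "Initech", a chunk such as "Acme Networks, Inc. completed its purchase of Globex Ltd." would be kept, while chunks mentioning only one of the names would be discarded before anything reaches the vector database or the LLM.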
At the current snapshot, the pipeline maps 111,470 ASNs into 81,233 organization families. It operates at web scale by reducing large volumes of text to a small, curated context before inference.
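The post-filtering and clustering step that produces these families can be sketched as follows. This is an assumption-laden illustration rather than the authors' code: it merges alias pairs with a union-find structure, then attaches each merged cluster to a single parent cluster, which simplifies the directed acyclic graph described above to a forest. Rebrand detection by majority vote is omitted, and all names and inputs are hypothetical.

```python
from collections import defaultdict


class UnionFind:
    """Minimal union-find used to merge organization records that are aliases."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def build_families(org_asns, alias_pairs, parent_child_pairs):
    """Group ASNs into organization families.

    org_asns: dict mapping organization name -> set of ASNs.
    alias_pairs: (org_a, org_b) pairs classified as aliases.
    parent_child_pairs: (parent_org, child_org) pairs.
    Returns a dict mapping the top-level cluster name -> sorted list of ASNs.
    """
    uf = UnionFind()
    for org in org_asns:
        uf.find(org)
    for a, b in alias_pairs:
        uf.union(a, b)

    # Record one parent per merged cluster (simplification: first parent wins).
    parent_of = {}
    for parent, child in parent_child_pairs:
        p, c = uf.find(parent), uf.find(child)
        if p != c:
            parent_of.setdefault(c, p)

    def top(cluster):
        # Walk up parent links; the 'seen' set guards against accidental cycles.
        seen = set()
        while cluster in parent_of and cluster not in seen:
            seen.add(cluster)
            cluster = parent_of[cluster]
        return cluster

    families = defaultdict(set)
    for org, asns in org_asns.items():
        families[top(uf.find(org))].update(asns)
    return {root: sorted(asns) for root, asns in families.items()}
```

For example, if "Acme Net" is an alias of "Acme Networks" and the parent of "Acme Cloud", all three records and their ASNs end up in one family, while an unrelated organization stays in its own.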
Why this matters
Correctly attributing ASNs to their real operators is important not only for Internet research but also for operations. For example, systems such as Cloudflare Radar regularly flag anomalies when the origin ASN does not match the relevant Route Origin Authorizations (ROAs). Many of these turn out not to be hijacks, but legitimate internal re-announcements between sibling ASNs.
Between January 2023 and July 2024, ASINT analysed 17,282 Border Gateway Protocol (BGP) anomaly alerts and identified 1,621 (~9.4%) as likely intra-organization. Operators were contacted about a sample of 100 of these cases; all 32 who responded confirmed that the announcements were internal rather than hijacks.
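The sibling check behind this analysis can be sketched in a few lines of Python. The function name, the label strings, and the `asn_to_family` mapping are assumptions made for illustration; the real alert format and ASINT's lookup interface are not described in this post.

```python
def classify_alert(origin_asn, roa_asns, asn_to_family):
    """Label an origin/ROA mismatch using an ASN -> organization-family mapping.

    origin_asn: the ASN that originated the announcement.
    roa_asns: ASNs authorized by the covering ROAs.
    asn_to_family: dict mapping ASN -> family identifier (hypothetical input).
    """
    if origin_asn in roa_asns:
        return "origin-matches-roa"
    origin_family = asn_to_family.get(origin_asn)
    if origin_family is None:
        return "unknown"
    for roa_asn in roa_asns:
        if asn_to_family.get(roa_asn) == origin_family:
            # Origin and an authorized ASN belong to the same family.
            return "likely-intra-organization"
    return "needs-investigation"


def share_percent(part, total):
    """Percentage of alerts in a category, rounded to one decimal place."""
    return round(100 * part / total, 1)
```

Applied to the figures above, `share_percent(1621, 17282)` gives 9.4, matching the reported share. An alert labelled "likely-intra-organization" is still only a hint: it suggests the mismatch is between sibling ASNs, which is why the cases were also checked with operators.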
Try it out
ASINT is available to the community and can be searched in two ways:
- ASN search example: AS18733 — Microsoft and acquired ASNs
- Organization search example: Deloitte across multiple RIRs
Community input needed
The team is seeking operator feedback to improve accuracy:
- Check your organization and ASNs. If mappings look correct, click the thumbs-up. If not, click the thumbs-down and leave a short comment. Useful details include legal entity names, subsidiaries, brand versus legal naming, mergers and acquisitions, or other business structures (such as resellers, government agencies, or multi-tenant network operations centres).
- Report errors. Two common issues are false positives (different organizations grouped together) and misses (sibling ASNs not linked). Comments will help refine both the data and the extraction pipeline: we will feed them into the next LLM pipeline cycle to correct mappings and tune extraction rules.
- Share broader feedback. This includes additional data sources, recurring edge cases, balance between over- and under-merging, or interface issues.
Feedback can be given directly on the site, or by email to the team: tijay@vt.edu, yongzhe@vt.edu, weitongli@vt.edu.
Looking ahead
ASINT is intended to complement, not replace, existing datasets and community curation efforts. The researchers plan to report back on corrections learned from operator feedback and to continue iterating on the tool.
Operators who maintain ground-truth lists for their own networks are especially encouraged to share pointers, as this will strengthen the dataset for everyone.
Taejoong (Tijay) Chung is an Associate Professor at Virginia Tech interested in Internet measurement and security.
The views expressed by the authors of this blog are their own and do not necessarily reflect the views of APNIC.