APNIC’s cloud and interconnection strategy

14 Jun 2023

Category: Tech matters

Well-connected, resilient, and low-latency connectivity is a key component of APNIC’s current and future operations. APNIC continues to deliver services via cloud providers. The different cloud-based approaches for APNIC’s online services are described below.

APNIC uses a range of approaches across different cloud providers for its publicly visible services and core internal processes. This is to improve availability, reduce downtime and service outages, and deliver service as effectively as possible.

From its on-premises locations, which are reachable via multiple transit and peering links, APNIC maintains connections to major Internet hubs in the Asia Pacific region.

APNIC welcomes feedback from the community on its cloud approach.

High-level strategy

In 2021, APNIC developed a high-level cloud and interconnection strategy to guide future work, and it was reviewed in early 2023.

Overall goals

  • Deliver resilience, availability, integrity, and low latency for all APNIC products and services
  • Ensure the security of services and data
  • Minimize vendor lock-in

Cloud, network, and interconnection strategy

  • Maintain a well-connected core network in Brisbane for office on-premises connectivity
  • Maintain critical data on-premises for resilience and data security
  • Maintain remote networks in strategic locations in the region to host APNIC services
  • Maintain resilient and low-latency connectivity to cloud providers hosting critical APNIC services and key Internet hubs in the Asia Pacific region
  • Where availability targets require, maintain a hybrid cloud model for critical services as a combination of on-premises private cloud, plus public cloud where appropriate
  • Where availability targets require, use multiple cloud providers
  • Encourage and support peering for resilient and low-latency connectivity to APNIC services wherever they are delivered

This allows APNIC to benefit from:

  • Many global locations, placing delivery of products close to users, and providing a better user experience
  • Fast scaling capability to meet dynamic demand and manage budget
  • Managed services such as databases and object storage
  • Distributed Denial-of-Service (DDoS) protection, caching, and resilient back-haul networks for client-to-server communication
  • Reduced maintenance
  • API-driven operation, with well-supported automation tooling for improved software development and deployment

Use of cloud service providers

To ensure the availability of all the registry services, APNIC uses a range of cloud providers depending on service requirements and other factors. APNIC has selected more than one independently routed and operated cloud provider to benefit from diversity of routing and avoid single models and single points of failure (noting that in some instances these are simply unavoidable).

Figure 1 — Diagram of APNIC’s cloud strategy: Fronted by Cloudflare, backed by Kubernetes, on-premises and in GCP.

By designing most new services to deploy to Kubernetes, APNIC benefits from its built-in self-healing properties, auto-scaling (increase to meet demand) and a well-supported deployment platform across both on-premises and the large cloud providers, minimizing vendor lock-in.

Where service delivery demands a single point of authority, and for older services in maintenance, the use of Virtual Machines (VMs) is considered acceptable, giving APNIC both ‘warm’ and ‘cold’ standby options to switch out a VM where necessary.

Use of Cloudflare

Where an HTTPS web service is exposed, APNIC uses Cloudflare as the primary agent to serve these pages from the internal authoritative web servers. A web service like www.apnic.net or blog.apnic.net is implemented through a name sub-delegated as a CNAME to Cloudflare. This offers high availability (at Cloudflare’s Service-Level Agreement) with localized service delivery in all regions where Cloudflare has a cache. Cloudflare can automatically inform APNIC of regional outages, helps manage DDoS threats, and reduces the load on internally managed systems.

Use of Google Cloud Platform and on-premises Kubernetes

Where possible, the web-based services behind these Cloudflare public DNS caches are delivered from a Kubernetes platform, running either in Google Cloud Platform (GCP) or at APNIC’s on-premises location. Diverse Kubernetes back-ends were deliberately chosen so that APNIC can relocate services if there were a catastrophic failure of either the on-premises location or a cloud provider, and to avoid ‘capture’ behind a single Border Gateway Protocol (BGP) Autonomous System (AS) for delivery of these services.

Where a service is designed to perform updates to the registry (for example, my.apnic.net resource management) it is provided through on-premises Kubernetes or Virtual Machines.

Whois implementation as a ‘unicast’ cloud service with redirection

To provide the port-43 Whois service, a small number of VMs operate on multiple providers. GeoDNS manages server selection and load distribution worldwide, and the nodes provide redundancy in the face of a local failure. These Whois nodes replicate data from APNIC’s on-premises VMs.
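The selection behaviour described above can be sketched in a few lines of Python. This is only an illustration of the idea, not APNIC’s GeoDNS configuration: the node names, region codes, and preference table are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WhoisNode:
    name: str      # hypothetical node label
    region: str    # coarse region the node is hosted in
    healthy: bool  # result of the most recent health check


# Hypothetical preference order: for each client region, node regions to try first.
REGION_PREFERENCE = {
    "oceania": ["oceania", "asia", "north-america"],
    "asia": ["asia", "oceania", "north-america"],
    "north-america": ["north-america", "asia", "oceania"],
}


def select_node(client_region, nodes):
    """Return the closest healthy node for a client region, or None if all are down."""
    for region in REGION_PREFERENCE.get(client_region, []):
        for node in nodes:
            if node.region == region and node.healthy:
                return node
    # Unknown client region, or no preferred node is up: fall back to any healthy node.
    for node in nodes:
        if node.healthy:
            return node
    return None
```

A client in a region whose local node has failed is simply answered with the next-closest healthy node, which is the redundancy property the paragraph describes.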

Reverse DNS implementation as an ‘anycast’ cloud of authoritative nameservers

APNIC operates a global system of DNS nameservers, in conjunction with the other Regional Internet Registries (RIRs) and with commercially backed DNS anycast service providers (Netnod), to ensure wide availability of the DNS delegations for the resources under management.

Figure 2 — Diagram of the anycast DNS service for the apnic.net domain names APNIC operates.

Operations functionality in the cloud

APNIC makes use of cloud systems for most internal functions.

  • Payroll, billing, and Member management are provided through NetSuite, Expensify, and Salesforce
  • Office 365 for email, document creation and storage
  • Software development is managed in Jira, with both on-premises and GCP-hosted clusters

Systems monitoring across all dependencies

APNIC monitors these elements using Sensu for probing and alerting, Prometheus for metrics collection, and Grafana for display. External services are used to continuously monitor service availability from outside APNIC’s own routing architecture, so that any loss of service that is not visible from the ‘inside’ can still be seen.
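To illustrate why the outside view matters, here is a minimal sketch that compares probe results taken inside and outside the network. The function name and the shape of the probe results are assumptions made for this example; it does not use Sensu, Prometheus, or Grafana.

```python
def hidden_outages(internal, external):
    """Return services that look up from inside but are down from outside.

    Both arguments map a service name to True (reachable) or False (unreachable).
    Services with no external result are skipped rather than treated as down.
    """
    return sorted(
        name
        for name, up_inside in internal.items()
        if up_inside and external.get(name) is False
    )
```

For example, if internal probes report `whois` as up while an external probe reports it as down, `hidden_outages` returns `["whois"]`, which is the kind of failure internal monitoring alone would miss.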

Future activity

Software development using continuous integration and deployment in Kubernetes provides a mechanism that can be ‘applied’ to a wide variety of cloud providers. APNIC is therefore most likely to continue this approach to systems development as relationships with service providers are reviewed.

As existing system dependencies are reviewed, APNIC is considering changes within this basic methodology: adopting new cloud providers, deploying services to diverse Kubernetes platforms, and scaling them in light of service demand.

With access to cloud deployment, some system architectural decisions taken when APNIC deployed predominantly to VMs on-premises can be reconsidered. Potential reductions in deployment costs for services such as Registration Data Access Protocol (RDAP) are being explored.

APNIC welcomes feedback from the community on its cloud approach. The Secretariat has begun community consultation on its critical service availability project and welcomes input from all interested Members.

Registry-specific services APNIC provides

Whois and RDAP information services

The port-43 ‘Whois’ service, and the related RDAP service, document the status of resources that have been delegated and registered to specific entities. This information includes Internet Routing Registry (IRR) functions which are used to assist in the production of BGP configuration and information about the resource holders, to help debug problems with global routing, aid law enforcement, and for general community access to find ‘who has’ the resources.
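For readers unfamiliar with the port-43 protocol, the exchange is very simple: the client sends one line of text terminated by CRLF, and the server replies and then closes the connection (RFC 3912). The sketch below shows this with Python’s standard library. The default hostname `whois.apnic.net` is an assumption of this example, as it is not named in this article, and the function names are illustrative.

```python
import socket

WHOIS_PORT = 43


def build_query(term):
    """Encode a Whois query: the search term followed by CRLF."""
    return term.strip().encode("ascii") + b"\r\n"


def whois_lookup(term, host="whois.apnic.net", timeout=10.0):
    """Send one query to a Whois server and return the full text response."""
    chunks = []
    with socket.create_connection((host, WHOIS_PORT), timeout=timeout) as conn:
        conn.sendall(build_query(term))
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection once the reply is complete
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

Calling `whois_lookup("203.0.113.0")` requires network access. Because each query is a single short-lived TCP connection, the service is easy to distribute across several nodes.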

Domain Name Service (DNS) authoritative servers

Reverse DNS is a subset of the domain name service function designed to map Internet addresses to an assigned name. Most DNS usage is mapping from names to addresses; Reverse DNS is the inverse function. In concert with ICANN/IANA and the other RIRs, APNIC operates DNS nameservers to perform this function.
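As a small illustration of the mapping, the name queried for a reverse lookup is built by reversing the address and placing it under `in-addr.arpa` (IPv4) or `ip6.arpa` (IPv6). Python’s standard `ipaddress` module can produce these names; the wrapper function name below is only for illustration.

```python
import ipaddress


def reverse_name(address):
    """Return the reverse DNS (PTR) owner name for an IPv4 or IPv6 address."""
    return ipaddress.ip_address(address).reverse_pointer


# IPv4 octets are reversed and placed under in-addr.arpa.
print(reverse_name("203.0.113.5"))  # 5.113.0.203.in-addr.arpa
```

IPv6 addresses follow the same pattern, one hexadecimal digit per label, under `ip6.arpa`.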

RPKI servers and associated Hardware Security Modules (HSM)

Resource Public Key Infrastructure (RPKI) is a framework of Public Key Infrastructure (PKI) certificates and signed objects which relate to Internet Number Resources. APNIC operates one of five ‘trust anchors’, which is a highly secure PKI service to anchor the validation of these certificates and signed objects. From this, APNIC delegates and operates RPKI services and the associated public repositories: stores of all the cryptographic materials made by APNIC and by resource holders.

Additional public and internal services

Alongside these services which directly implement core functions of a regional Internet registry, APNIC operates other services which implement supporting functions including membership, resource and administrative activities of the organization.

APNIC has multiple web services for Member management, resource management, information discovery, training, and the regular conferences and policy development processes. It also runs servers that support the Secretariat function internally: databases, finance and asset control, document storage, and registry systems, which in turn require software development platforms for code management, testing, and release engineering. These also require APNIC to operate networks internally with switching services, and interconnections to data centres and the global Internet.

To provide all these functions, APNIC uses small delegations of public number resources for the operation of these services from the Secretariat offices and other locations. These are necessarily ‘critical dependencies’ for the provision of some of APNIC’s core services. The use of public number resources is minimized, and most of APNIC’s back office is numbered using private (RFC 1918) address space.
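The private ranges referred to here are the three IPv4 blocks defined in RFC 1918. As a minimal sketch, the check below tests whether an address falls inside one of them, using Python’s standard `ipaddress` module; the function name is illustrative and the example addresses are documentation addresses, not APNIC’s.

```python
import ipaddress

# The three private IPv4 blocks defined in RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address):
    """Return True if the address is IPv4 and lies within an RFC 1918 block."""
    addr = ipaddress.ip_address(address)
    return addr.version == 4 and any(addr in net for net in RFC1918_NETWORKS)
```

Here `is_rfc1918("10.1.2.3")` is true, while a public address such as `203.0.113.5` returns false.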
