How to: Edge router and BNG optimization

By Daryll Swer on 24 Jun 2021

Category: Tech matters

NOTE: The author has updated the information in this post. See the author’s blog post for the latest information.

Many ISPs in the Asia Pacific region use MikroTik RouterOS to provide access to their customers via PPPoE (please get on board with IPv6!), and some use MikroTik for their edge/core routers as well. This post will walk through some issues and solutions for those ISPs.

MikroTik uses RouterOS so this guide will be based on MikroTik RouterOS syntax but it shouldn’t be too hard to replicate the same configuration on other platforms. While RouterOS is based on the Linux Kernel, RouterOS v6 stable/LTS runs on an ancient version using legacy iptables for packet filtering. Hopefully, they will implement newer frameworks for their RouterOS v7, but if you want to thoroughly understand the logic flow behind these suggestions and rules, it’s worth getting familiar with Linux Kernel documentation on the web.

This guide is meant for network engineers/ISPs so I will assume the reader has some basic knowledge of the terminology and technologies/protocols used in typical BNG/CGNAT configuration. From here onwards, core and edge are synonymous, and BNG and access router are synonymous, for the sake of simplicity.

Let’s get started.

General configuration change

I was surprised to find a few (ideally, it should be zero) networks not implementing basic security features on their routers. Some networks are still using Telnet (in this day and age!), exposing neighbor discovery and similar, and running on outdated RouterOS and firmware. I strongly recommend readers make use of MikroTik’s basic security guide and the measures below:

  • Upgrade RouterOS to the latest long-term release
  • Upgrade the firmware after OS
  • Make use of Reverse Path Filtering (slightly different from Unicast Reverse Path Forwarding), found in IP>Settings>rp-filter
    • Use loose mode on edge routers and wherever asymmetric or policy routing takes place (always use loose mode wherever AS termination takes place)
    • Use strict mode on the BNG and wherever symmetric routing takes place
  • Make use of interface lists inside Interfaces>Interface List on all routers. Use WAN for all public interface/outgoing and LAN for all local interfaces/customer-facing side, remembering to include dynamic interfaces for LAN on BNG to account for all PPPoE users
Figure 1 — Configuration changes for interface lists.
  • Disable connection tracking on the edge router with /ip firewall connection tracking set enabled=no
  • Enable loose TCP tracking on all routers including BNG with /ip firewall connection tracking set loose-tcp-tracking=yes
  • Use the connection_tracking timeout values shown in Figure 2 on all routers
Figure 2 — Recommended connection tracking timeout values.

We saw real-time improvements to stability and performance especially for UDP traffic such as VoIP, gaming, P2P UDP NAT punching, and more, by using the values in Figure 2. You may need to upgrade the RAM to accommodate these values.
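
As a rough sketch, the general changes above could be applied from the terminal along the following lines. The interface name ether1 and the use of the default PPP profile are assumptions for illustration, and the timeout values from Figure 2 are not reproduced here. Run each group only on the router it applies to.

```
# Edge router: loose reverse path filtering, no connection tracking
/ip settings set rp-filter=loose
/ip firewall connection tracking set enabled=no

# BNG: strict reverse path filtering, loose TCP tracking
/ip settings set rp-filter=strict
/ip firewall connection tracking set loose-tcp-tracking=yes

# Interface lists (all routers); ether1 is a placeholder for the uplink
/interface list add name=WAN
/interface list add name=LAN
/interface list member add interface=ether1 list=WAN

# BNG: place dynamic PPPoE interfaces into LAN through the PPP profile
/ppp profile set [find name=default] interface-list=LAN
```

If the router still carries MikroTik's default configuration, the WAN and LAN lists may already exist, in which case only the member entries need to be added.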

BNG

There are some issues with PPPoE. We saw packet fragmentation caused by MTU/MRU values below the standard 1500. Typically ISPs use 1492, 1480, or some other unusual MTU size, so both the BNG and the customer router have to rely on hacks like TCP MSS Clamping to work around this. PMTUD is simply unreliable, as per RFC 8900, and it gets worse with CGNAT.

To solve these issues we deploy RFC 4638 and simply set MTU and MRU to 1500 inside the PPPoE Server on the BNG. The Layer 3 MTU of the underlying Ethernet interface must then be set to 1520, for example: /interface ethernet set [ find default-name=ether1_for_customer_delegation ] mtu=1520. The L2 MTU must be high enough to cover any other protocols in use, such as VLAN; the MikroTik default of 1598 is typically sufficient. Needing 1520 for the L3 MTU is a strange MikroTik quirk; 1508 should work fine on routers from Cisco, Juniper, and so on.

These settings are shown in Figures 3 and 4 and explained in the steps below.

Figure 3 — PPPoE server MTU/MRU and TCP MSS clamping configuration.
Figure 4 — Ethernet interface configuration for PPPoE delegation.

Disable the TCP MSS Clamping rules inside IP>Mangle and make use of PPP>Profile>Default* to enable TCP MSS Clamping directly on the PPPoE engine. This will do the work for any customer whose MTU/MRU is less than 1500.
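
For readers without access to the figures, the PPPoE settings described above might look roughly like this on the command line. The interface name ether1 and the service name pppoe-customers are placeholders, not values from the original deployment. The 1508 figure mentioned earlier comes from the PPPoE overhead: a 6-byte PPPoE header plus a 2-byte PPP protocol field, so 1500 + 8 = 1508.

```
# PPPoE server with 1500-byte MTU/MRU (RFC 4638)
# ether1 and the service name are placeholders
/interface pppoe-server server add service-name=pppoe-customers interface=ether1 max-mtu=1500 max-mru=1500 default-profile=default disabled=no

# Raise the Ethernet MTU so a 1500-byte payload plus PPPoE overhead fits
/interface ethernet set [find default-name=ether1] mtu=1520

# Clamp TCP MSS in the PPP profile instead of using mangle rules
/ppp profile set [find name=default] change-tcp-mss=yes
```

If customers are assigned a profile other than default, the change-tcp-mss setting would need to be applied to that profile instead.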

On the customer side, not all routers can take advantage of RFC 4638 (TP-Link, Tenda, and others, for example). For them, the MTU must remain capped at 1492. The 1492 limit on their end won’t cause fragmentation issues, because packets are fragmented at the source (their routers) before they exit the interface and hit the BNG; TCP MSS Clamping on the PPPoE engine takes care of anything coming in from the outside world towards the customer.

I have observed 1500 MRU when pinging from the outside world, suggesting some of these consumer routers support 1500 MRU. If they are using MikroTik, pfSense, VyOS, or similar, they can take advantage of RFC 4638, aka 1500 MTU/MRU, for their PPPoE client. Note that some ONT/ONU devices have strange behaviours for MTU negotiation; only a few brands like GX, TP-Link, and Huawei have been found to be flawless.

For PPPoE, Proxy-ARP should be enabled on the Ethernet interfaces used for serving customers. You can do that via Interfaces>Whateverinterface>ARP=proxy-arp. When it is disabled, for whatever reason, a ping to a PPPoE customer from the BNG or from another customer leaves the access router and then comes back in through the edge router, even though static routes are manually set on the BNG itself. This leads to insanely high latency instead of sub-millisecond latency.
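
The equivalent terminal command is sketched below, assuming the customer-facing interface is ether1 (a placeholder name):

```
# Enable Proxy-ARP on the customer-facing Ethernet interface
/interface ethernet set [find default-name=ether1] arp=proxy-arp
```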

CGNAT

The issue is that the majority of ISPs use RFC 1918 subnets for CGNAT, which can clash with subnets on the customer site, breaking P2P traffic and the end-to-end principle. CGNAT also requires proper NAT traversal for various protocols, including IPSec.

The solution is to use the 100.64.0.0/10 subnet, which is reserved for CGNAT (RFC 6598) and so avoids clashes with the customer site, and to enable the NAT traversal helpers on the router, as shown in Figure 5, inside IP>Firewall>Service Ports.

Figure 5 — NAT traversal helpers in RouterOS.
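
The helpers can also be inspected and toggled from the terminal. The sketch below is illustrative only: the exact set of helpers enabled in Figure 5 is not reproduced here, so pptp and sip are just examples of the syntax.

```
# List the helpers and their current state
/ip firewall service-port print

# Enable a helper by name (pptp and sip are examples only)
/ip firewall service-port set [find name=pptp] disabled=no
/ip firewall service-port set [find name=sip] disabled=no
```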

Use a simple netmap rule with IPSec passthrough, which will allow customers to initiate IPSec out-bound without issues, configured like this:

/ip firewall nat add action=netmap chain=srcnat comment="CGNAT rule" dst-address-list=!local ipsec-policy=out,none out-interface-list=WAN src-address-list=local to-addresses=public/25

Where “local” is the address list containing CGNAT subnets.

dst-address-list=!local is self-explanatory: anything destined towards CGNAT subnets shouldn’t be NAT’ed.
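
The rule above relies on the “local” address list already existing. A minimal sketch of creating it, assuming the whole 100.64.0.0/10 block is used for CGNAT customers, could be:

```
# Address list referenced by the CGNAT netmap rule
/ip firewall address-list add address=100.64.0.0/10 comment="CGNAT subnets" list=local

# Confirm the NAT rule is present and check its counters
/ip firewall nat print stats
```

If only part of 100.64.0.0/10 is in use, listing the specific subnets instead keeps the rule from matching unassigned space.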

Customers should be able to talk to each other using their CGNAT IPs. Xbox makes use of this, as mentioned in RFC 7021. This is like the old days when everyone had a public IP and was therefore reachable.

We tried src-nat as the action, but it resulted in the NAT’ed public IP constantly changing on the customer side and breaking things. It’s best to avoid deterministic NAT if you can. The above rule allows P2P traffic initiated from the inside to be reachable from the outside for the various applications that make use of ephemeral ports, UDP NAT punching, STUN, and so on.

We were able to successfully seed the official Ubuntu torrent from behind the CGNAT with the above configuration, which can only mean one thing: in-bound P2P connections are being established successfully!

Figure 6 — BitTorrent seeding behind CGNAT.

Firewall and security

The issues are that MikroTik lacks basic DDoS protection, simple bogon filtering, and basic rules such as dropping invalid traffic on the input chain.

The solutions are shown below in commented code. Here are the generic firewall rules that should be deployed on the BNG to cover basic security grounds:

#First we take care of address lists#
/ip firewall address-list

#Enter all local subnets/public subnets applicable to your ASN, use the full CIDR notation of the public IPv4 block assigned to you to avoid missing anything out, please avoid something like /30#

add address=example_public/24 comment="LAN subnets" list=lan_subnets
add address=example_local_private/24 comment="LAN subnets" list=lan_subnets
add address=example_cgnat_subnets/24 comment="LAN subnets" list=lan_subnets

#Create an address list containing all CGNAT subnets for use as dst-address-list in the drop !dst-NATted rule on the forward chain#
add address=100.64.0.0/10 comment="CGNAT subnets" list=cgnat_customers

###Required for DDoS protection rules###
add list=ddos-attackers
add list=ddos-targets

###Bogon filtering addresses for each of the rules in RAW/Filter###
add address=0.0.0.0/8 comment=RFC6890 list=not_in_internet
add address=172.16.0.0/12 comment=RFC6890 list=not_in_internet
add address=192.168.0.0/16 comment=RFC6890 list=not_in_internet
add address=10.0.0.0/8 comment=RFC6890 list=not_in_internet
add address=169.254.0.0/16 comment=RFC6890 list=not_in_internet
add address=127.0.0.0/8 comment=RFC6890 list=not_in_internet
add address=224.0.0.0/4 comment=Multicast list=not_in_internet
add address=198.18.0.0/15 comment=RFC6890 list=not_in_internet
add address=192.0.0.0/24 comment=RFC6890 list=not_in_internet
add address=192.0.2.0/24 comment=RFC6890 list=not_in_internet
add address=198.51.100.0/24 comment=RFC6890 list=not_in_internet
add address=203.0.113.0/24 comment=RFC6890 list=not_in_internet
add address=100.64.0.0/10 comment=RFC6890 list=not_in_internet
add address=240.0.0.0/4 comment=RFC6890 list=not_in_internet
add address=192.88.99.0/24 comment="6to4 relay Anycast [RFC 3068]" list=not_in_internet
add address=255.255.255.255 comment=RFC6890 list=not_in_internet
add address=127.0.0.0/8 comment="RAW Filtering - RFC6890" list=bad_ipv4
add address=192.0.0.0/24 comment="RAW Filtering - RFC6890" list=bad_ipv4
add address=192.0.2.0/24 comment="RAW Filtering - RFC6890 documentation" list=bad_ipv4
add address=198.51.100.0/24 comment="RAW Filtering - RFC6890 documentation" list=bad_ipv4
add address=203.0.113.0/24 comment="RAW Filtering - RFC6890 documentation" list=bad_ipv4
add address=240.0.0.0/4 comment="RAW Filtering - RFC6890 reserved" list=bad_ipv4
add address=224.0.0.0/4 comment="RAW Filtering - multicast" list=bad_src_ipv4
add address=255.255.255.255 comment="RAW Filtering - RFC6890" list=bad_src_ipv4
add address=0.0.0.0/8 comment="RAW Filtering - RFC6890" list=bad_dst_ipv4
add address=224.0.0.0/4 comment="RAW Filtering - multicast" list=bad_dst_ipv4

/ip firewall filter
add action=accept chain=input comment="defconf: accept established,related,untracked" connection-state=established,related,untracked
add action=drop chain=input comment="defconf: drop invalid" connection-state=invalid

#Example to allow Winbox from all interfaces LAN/WAN; set dst-port to your own Winbox port (the default is 8291)#
add action=accept chain=input comment="Accept Winbox TCP" dst-port=9943 protocol=tcp

add action=drop chain=input comment="defconf: drop all not coming from LAN's interface list/subnets" in-interface-list=!LAN

add action=accept chain=forward comment="allow already established connections" connection-state=established,related,untracked

#Works well only if symmetric routing occurs on the BNG. Otherwise it could block legitimate traffic#
add action=drop disabled=yes chain=forward comment="drop invalid connections" connection-state=invalid

add action=jump chain=forward comment="Jump to DDoS detection" connection-state=new in-interface-list=WAN jump-target=detect-ddos
add action=return chain=detect-ddos dst-limit=50,50,src-and-dst-addresses/10s
add action=return chain=detect-ddos dst-limit=50,50,src-and-dst-addresses/10s protocol=tcp tcp-flags=syn,ack
add action=add-dst-to-address-list address-list=ddos-targets address-list-timeout=10m chain=detect-ddos
add action=add-src-to-address-list address-list=ddos-attackers address-list-timeout=10m chain=detect-ddos

add action=drop chain=forward comment="Drop tries to reach not public addresses from LAN" dst-address-list=not_in_internet in-interface-list=LAN out-interface-list=WAN

add action=drop chain=forward comment="defconf: drop all from WAN not DSTNATed" connection-nat-state=!dstnat connection-state=new dst-address-list=cgnat_customers in-interface-list=WAN

/ip firewall raw
add action=drop chain=prerouting comment="Drop DDoS src and dst address list" dst-address-list=ddos-targets src-address-list=ddos-attackers

add action=accept chain=prerouting comment="Enable this rule for transparent mode" disabled=yes

#If you are using DHCP, change this to accept#
add action=drop chain=prerouting comment="defconf: Drop DHCP discover" dst-address=255.255.255.255 dst-port=67 in-interface-list=LAN protocol=udp src-address=0.0.0.0 src-port=68

add action=drop chain=prerouting comment="defconf: drop bad src IPs" src-address-list=bad_ipv4
add action=drop chain=prerouting comment="defconf: drop bad dst IPs" dst-address-list=bad_ipv4
add action=drop chain=prerouting comment="defconf: drop bad src IPs" src-address-list=bad_src_ipv4
add action=drop chain=prerouting comment="defconf: drop bad dst IPs" dst-address-list=bad_dst_ipv4
add action=drop chain=prerouting comment="defconf: drop non global from WAN" in-interface-list=WAN src-address-list=not_in_internet

#Remember to properly enter all subnets in the lan_subnets list for both your ASN public IPv4 blocks and CGNAT/local subnets#
add action=drop chain=prerouting comment="defconf: drop local if not from default IP range" in-interface-list=LAN src-address-list=!lan_subnets

add action=drop chain=prerouting comment="defconf: drop bad UDP" port=0 protocol=udp
add action=jump chain=prerouting comment="defconf: jump to ICMP chain" jump-target=icmp protocol=icmp
add action=jump chain=prerouting comment="defconf: jump to TCP chain" jump-target=bad_tcp protocol=tcp
add action=accept chain=prerouting comment="defconf: accept everything else from LAN" in-interface-list=LAN
add action=accept chain=prerouting comment="defconf: accept everything else from WAN" in-interface-list=WAN
add action=drop chain=prerouting comment="defconf: drop the rest (requires more testing)" disabled=yes log=yes log-prefix=BlockingTest
add action=accept chain=icmp comment="defconf: echo reply" icmp-options=0:0 protocol=icmp
add action=accept chain=icmp comment="defconf: net unreachable" icmp-options=3:0 protocol=icmp
add action=accept chain=icmp comment="defconf: host unreachable" icmp-options=3:1 protocol=icmp
add action=accept chain=icmp comment="defconf: protocol unreachable" icmp-options=3:2 protocol=icmp
add action=accept chain=icmp comment="defconf: port unreachable" icmp-options=3:3 protocol=icmp
add action=accept chain=icmp comment="defconf: host unreachable fragmentation required" icmp-options=3:4 protocol=icmp
add action=accept chain=icmp comment="defconf: echo request" icmp-options=8:0 protocol=icmp
add action=accept chain=icmp comment="defconf: time exceeded " icmp-options=11:0-255 protocol=icmp
add action=accept chain=icmp comment="defconf: allow parameter bad" icmp-options=12:0 protocol=icmp
add action=drop chain=icmp comment="defconf: drop other icmp" protocol=icmp
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=!fin,!syn,!rst,!ack
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=fin,syn
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=fin,rst
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=fin,!ack
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=fin,urg
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=syn,rst
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=rst,urg
add action=drop chain=bad_tcp comment="defconf: TCP port 0 drop" port=0 protocol=tcp

For edge routers

The purpose of the edge router is to route as fast as possible. With that in mind, and in addition to the general changes described at the beginning of this article, the edge router should follow these principles:

  1. No NAT
  2. No connection tracking
  3. No fancy ‘features’ (like Hotspot, PPPoE)
  4. Use your BNG routers for any customer delegation that is required

We only need to do two things:

  1. Make use of BGP Route Filtering to discard certain things:
    • Such as dropping your own prefixes from the outside
    • Dropping default routes from all peers
    • Here’s a MikroTik presentation with more details on this subject
  2. Use the RAW table to drop remaining bogon/garbage traffic similar to the one used on the BNG
    • CPU usage stays minimal when using the RAW table
    • Absolutely nothing on the filter table
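
As a rough illustration of the BGP route filtering in step 1, the following sketch uses RouterOS v6 routing filter syntax (v7 uses a different syntax). The chain name ebgp-in, the peer name upstream1, and the documentation prefix 192.0.2.0/24 standing in for your own prefix are all placeholders.

```
/routing filter
# Discard default routes received from peers
add chain=ebgp-in prefix=0.0.0.0/0 prefix-length=0 action=discard comment="drop default route"
# Discard your own prefix, and anything more specific, learned from outside
add chain=ebgp-in prefix=192.0.2.0/24 prefix-length=24-32 action=discard comment="drop own prefix"

# Attach the chain to a peer as its inbound filter
/routing bgp peer set [find name=upstream1] in-filter=ebgp-in
```

Whether to drop default routes from every peer depends on whether the edge router carries full tables, so treat this as a starting point rather than a complete policy.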

The address lists and RAW rules for the edge router are as follows:

/ip firewall address-list
#Enter all local subnets/public subnets applicable to your ASN, use the full CIDR notation of the public IPv4 block assigned to you to avoid missing anything out, please avoid something like /30#

add address=example_public/24 comment="LAN subnets" list=lan_subnets
add address=example_local_private/24 comment="LAN subnets" list=lan_subnets

add address=0.0.0.0/8 comment=RFC6890 list=not_in_internet
add address=172.16.0.0/12 comment=RFC6890 list=not_in_internet
add address=192.168.0.0/16 comment=RFC6890 list=not_in_internet
add address=10.0.0.0/8 comment=RFC6890 list=not_in_internet
add address=169.254.0.0/16 comment=RFC6890 list=not_in_internet
add address=127.0.0.0/8 comment=RFC6890 list=not_in_internet
add address=224.0.0.0/4 comment=Multicast list=not_in_internet
add address=198.18.0.0/15 comment=RFC6890 list=not_in_internet
add address=192.0.0.0/24 comment=RFC6890 list=not_in_internet
add address=192.0.2.0/24 comment=RFC6890 list=not_in_internet
add address=198.51.100.0/24 comment=RFC6890 list=not_in_internet
add address=203.0.113.0/24 comment=RFC6890 list=not_in_internet
add address=100.64.0.0/10 comment=RFC6890 list=not_in_internet
add address=240.0.0.0/4 comment=RFC6890 list=not_in_internet
add address=192.88.99.0/24 comment="6to4 relay Anycast [RFC 3068]" list=not_in_internet
add address=255.255.255.255 comment=RFC6890 list=not_in_internet
add address=127.0.0.0/8 comment="RAW Filtering - RFC6890" list=bad_ipv4
add address=192.0.0.0/24 comment="RAW Filtering - RFC6890" list=bad_ipv4
add address=192.0.2.0/24 comment="RAW Filtering - RFC6890 documentation" list=bad_ipv4
add address=198.51.100.0/24 comment="RAW Filtering - RFC6890 documentation" list=bad_ipv4
add address=203.0.113.0/24 comment="RAW Filtering - RFC6890 documentation" list=bad_ipv4
add address=240.0.0.0/4 comment="RAW Filtering - RFC6890 reserved" list=bad_ipv4
add address=224.0.0.0/4 comment="RAW Filtering - multicast" list=bad_src_ipv4
add address=255.255.255.255 comment="RAW Filtering - RFC6890" list=bad_src_ipv4
add address=0.0.0.0/8 comment="RAW Filtering - RFC6890" list=bad_dst_ipv4
add address=224.0.0.0/4 comment="RAW Filtering - multicast" list=bad_dst_ipv4

/ip firewall raw
add action=accept chain=prerouting comment="Enable this rule for transparent mode" disabled=yes

#If you are using DHCP, change this to accept#
add action=drop chain=prerouting comment="defconf: Drop DHCP discover on LAN" dst-address=255.255.255.255 dst-port=67 in-interface-list=LAN protocol=udp src-address=0.0.0.0 src-port=68

add action=drop chain=prerouting comment="defconf: drop bad src IPs" src-address-list=bad_ipv4
add action=drop chain=prerouting comment="defconf: drop bad dst IPs" dst-address-list=bad_ipv4
add action=drop chain=prerouting comment="defconf: drop bad src IPs" src-address-list=bad_src_ipv4
add action=drop chain=prerouting comment="defconf: drop bad dst IPs" dst-address-list=bad_dst_ipv4
add action=drop chain=prerouting comment="defconf: drop non global from WAN" in-interface-list=WAN src-address-list=not_in_internet
add action=drop chain=prerouting comment="defconf: drop local if not from default IP range" in-interface-list=LAN src-address-list=!lan_subnets
add action=drop chain=prerouting comment="defconf: drop bad UDP" port=0 protocol=udp
add action=jump chain=prerouting comment="defconf: jump to ICMP chain" jump-target=icmp protocol=icmp
add action=jump chain=prerouting comment="defconf: jump to TCP chain" jump-target=bad_tcp protocol=tcp
add action=accept chain=prerouting comment="defconf: accept everything else from LAN" in-interface-list=LAN
add action=accept chain=prerouting comment="defconf: accept everything else from WAN" in-interface-list=WAN
add action=drop chain=prerouting comment="defconf: drop the rest"
add action=accept chain=icmp comment="defconf: echo reply" icmp-options=0:0 protocol=icmp
add action=accept chain=icmp comment="defconf: net unreachable" icmp-options=3:0 protocol=icmp
add action=accept chain=icmp comment="defconf: host unreachable" icmp-options=3:1 protocol=icmp
add action=accept chain=icmp comment="defconf: protocol unreachable" icmp-options=3:2 protocol=icmp
add action=accept chain=icmp comment="defconf: port unreachable" icmp-options=3:3 protocol=icmp
add action=accept chain=icmp comment="defconf: host unreachable fragmentation required" icmp-options=3:4 protocol=icmp
add action=accept chain=icmp comment="defconf: echo request" icmp-options=8:0 protocol=icmp
add action=accept chain=icmp comment="defconf: time exceeded " icmp-options=11:0-255 protocol=icmp
add action=accept chain=icmp comment="defconf: allow parameter bad" icmp-options=12:0 protocol=icmp
add action=drop chain=icmp comment="defconf: drop other icmp" protocol=icmp
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=!fin,!syn,!rst,!ack
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=fin,syn
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=fin,rst
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=fin,!ack
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=fin,urg
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=syn,rst
add action=drop chain=bad_tcp comment="defconf: TCP flag filter" protocol=tcp tcp-flags=rst,urg
add action=drop chain=bad_tcp comment="defconf: TCP port 0 drop" port=0 protocol=tcp

Firewall explanation

I will keep this concise, but as stated earlier, I suggest you study how iptables functions in general, along with the packet flow, to understand how the rules work.

With that being said, I will break it down into simpler points:

  • We are dropping spoofed traffic
    • The RAW rules drop anything coming from WAN that’s spoofed (RFC 6890 addresses)
    • The RAW rules drop anything coming from LAN that does not match your public prefixes/internal subnets (the lan_subnets address list), meaning any spoofed traffic is dropped before it can exit your network
    • Here’s an APNIC Blog post detailing more on this subject
  • Next, we are dropping bad traffic such as TCP/UDP port 0 or bad TCP flags
  • The filter rules are pretty self-explanatory
  • We aren’t using filter rules on edge router as it requires connection tracking for certain rules, which should be disabled in the first place for an edge router
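
To see which of these rules are actually matching traffic, the per-rule counters can be checked from the terminal. This is only a suggested way to verify the behaviour described above, not part of the original deployment notes.

```
# Per-rule packet and byte counters for the RAW table
/ip firewall raw print stats

# On the BNG: addresses currently flagged by the DDoS detection chain
/ip firewall address-list print where list=ddos-attackers
```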

This configuration was tested and deployed on AS135756 with Varun Singhania (the proprietor of the ASN).

I will update this information, outside the scope of the initial deployment/testing, as required. However, update frequency depends on the cooperation of ASNs willing to test IPv6/BGP (and more) optimization. Currently, this is strictly IPv4; the IPv6 config is ready to be tested in real-time. If you’re an AS operator who’s keen to do some testing, get in touch via me [at] daryllswer [dot] com.

Daryll Swer is an IT and networking enthusiast who currently works as a Technical Support Engineer at Civo.
