BGP.KMCD.DEV

BGP in plain English

Understanding BGP

The Internet's Control Plane

The Border Gateway Protocol (BGP) is the system that connects tens of thousands of independent networks to form the global internet. While interior protocols handle traffic within a single organization, BGP is built for scale and policy-driven routing between Autonomous Systems (ASes).

Networks use BGP to announce which IP addresses they own and discover paths to reach the rest of the world. Because the protocol was built on a model of implicit trust, it is susceptible to complex security issues like route hijacks and leaks, which we will discuss at length later in this guide.

IP Prefixes & Subnetting

IP addresses are grouped into blocks called Prefixes. In BGP, these are often referred to using "slash" notation, such as a "slash 24" (/24), which represents 256 individual IPv4 addresses.

  • Longest Prefix Match: Routers forward traffic along the most specific route that matches the destination, and this check comes before BGP's own path comparison. If one network announces a /23 and another announces a /24 that overlaps with it, traffic for addresses inside the /24 will generally follow the /24.
  • Minimum Prefix Size: While subnets can be smaller, a /24 is generally the smallest IPv4 block (the longest prefix) accepted in the global BGP table. Prefixes more specific than /24 are usually filtered by ISPs, except for specialized use cases like Remotely Triggered Black Hole (RTBH) filtering.
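
The longest-prefix-match rule above can be sketched with Python's standard ipaddress module. This is a minimal illustration, not how a router implements its forwarding table: the ROUTES table, the AS labels, and the function name are all hypothetical, and the addresses come from a documentation range.

```python
import ipaddress

# Hypothetical table: two overlapping announcements for the same address space.
ROUTES = {
    ipaddress.ip_network("198.51.100.0/23"): "AS64500",  # covering /23
    ipaddress.ip_network("198.51.100.0/24"): "AS64501",  # more specific /24
}


def longest_prefix_match(destination, routes):
    """Return (prefix, label) for the most specific route covering destination."""
    address = ipaddress.ip_address(destination)
    candidates = [net for net in routes if address in net]
    if not candidates:
        return None  # no route at all (and no default route in this sketch)
    best = max(candidates, key=lambda net: net.prefixlen)
    return best, routes[best]


# A /24 holds 2 ** (32 - 24) addresses.
print(ipaddress.ip_network("198.51.100.0/24").num_addresses)  # 256

# Inside the /24: the more specific route wins.
print(longest_prefix_match("198.51.100.7", ROUTES))
# Outside the /24 but inside the /23: only the /23 matches.
print(longest_prefix_match("198.51.101.7", ROUTES))
```

Real routers use specialized lookup structures (tries or hardware tables) rather than a linear scan, but the selection rule is the same.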

Network subnets are an entire subject on their own. To learn more about subnetting, see:

Cloudflare: What is a Subnet?

Autonomous System (AS)

A network or group of networks under a single administrative control. Major entities like Google, Comcast, and CERN each identify themselves with a unique Autonomous System Number (ASN).

Peering vs Transit

Peering is a direct interconnection between ASes, typically used to exchange traffic between the two networks and their customers. Transit is a service where a network pays a provider to carry its traffic to the rest of the internet. Peering is often settlement-free, while transit is a paid service.

BGP Sessions & Peering

Two ASes establish a session to exchange routing information. This handshake occurs over TCP port 179. This detail is essential for troubleshooting firewall and security policy blocks.

The Global Routing Table

This is the master list of all known IP prefixes and their best paths. Routers that carry the full table need no default route, and together they form what is often called the Default-Free Zone (DFZ).

Path Selection

A BGP Path is the specific chain of Autonomous Systems that data follows as it moves across the global internet. Since the internet is a "network of networks," there are often dozens of different ways to reach the same destination. BGP's job is to evaluate all those choices and pick the single "best" route from the list.

BGP makes routing decisions based on network policy. These choices are driven by business relationships, cost, and the overall health of the path. You can see these real-world routing decisions in action using Cloudflare Radar or by looking at the global routing table through RouteViews.

Path selection is a deliberate policy choice that each network operator configures, often reflecting agreements negotiated with its neighbors. These policies are vital for managing transit costs and maintaining link reliability across the Internet.

How BGP Chooses a Route

Note: This is the core decision process. Real routers evaluate many more tie-breakers and vendor-specific attributes (like Cisco Weight).

01 Local Preference: The primary way networks prioritize outbound paths, often preferring free peering over paid transit. The highest value wins.
02 AS Path Length: BGP typically prefers the shortest chain of networks (ASes), though policies and traffic engineering regularly override this.
03 Route Origin: BGP compares the ORIGIN attribute, preferring IGP over EGP over INCOMPLETE. Many implementations also prefer locally originated routes over those learned from neighbors, usually checking this before AS Path Length.
04 MED: A technical "hint" used to tell a neighboring network which entry point into your network is preferred for their inbound traffic. The lowest value wins.
05 eBGP over iBGP: External paths are strongly preferred over internal paths to ensure traffic exits the local network efficiently.
06 IGP Cost to Next Hop: Prefers paths with the lowest internal routing cost to reach the exit router.
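
The decision steps above can be sketched as a sort key, where each tuple element is one tie-breaker in order. This is a simplified illustration under several assumptions: the Route fields and function names are hypothetical, the Route Origin step is modeled as the ORIGIN attribute comparison (IGP, then EGP, then INCOMPLETE), and MED is compared across all candidates even though real routers normally compare it only between routes from the same neighboring AS.

```python
from dataclasses import dataclass

# Lower rank is preferred for the ORIGIN attribute.
ORIGIN_RANK = {"IGP": 0, "EGP": 1, "INCOMPLETE": 2}


@dataclass(frozen=True)
class Route:
    """Hypothetical container for the attributes used in the steps above."""
    name: str
    local_pref: int = 100
    as_path: tuple = ()
    origin: str = "IGP"
    med: int = 0
    is_ebgp: bool = True
    igp_cost: int = 0


def preference_key(route):
    """Smaller tuples are better; elements follow steps 01 to 06 in order."""
    return (
        -route.local_pref,          # 01: highest Local Preference wins
        len(route.as_path),         # 02: shortest AS path
        ORIGIN_RANK[route.origin],  # 03: IGP < EGP < INCOMPLETE
        route.med,                  # 04: lowest MED (simplified, see lead-in)
        0 if route.is_ebgp else 1,  # 05: eBGP over iBGP
        route.igp_cost,             # 06: lowest IGP cost to the next hop
    )


def best_path(routes):
    return min(routes, key=preference_key)


candidates = [
    Route("transit", local_pref=100, as_path=(64500, 64510)),
    Route("peer", local_pref=200, as_path=(64501, 64520, 64510)),
]
# Local Preference is checked first, so the longer peering path still wins.
print(best_path(candidates).name)  # peer
```

Modeling the process as a tuple keeps the ordering explicit: a later attribute only matters when every earlier one ties.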

BGP Communities

Metadata "tags" attached to routes that signal instructions to upstream peers. Standardized via RFC 1997 and RFC 4360. Important note: most community semantics are operator-defined and not universal.

  • Blackholing: Used for DDoS mitigation. Example: 65535:666 (the well-known BLACKHOLE community, RFC 7999)
  • Traffic Steering: Influencing path priority. Example: ASN:70 (Common convention to Set Local-Pref 70, but varies per ISP)
  • Scoping: Limiting how far a route propagates. Example: NO_EXPORT (well-known, RFC 1997), which keeps a route from being advertised beyond the receiving AS
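
A standard (RFC 1997) community is a single 32-bit value, conventionally written as two 16-bit halves separated by a colon. The sketch below shows that encoding for the examples above; the function names are hypothetical, and the lookup table lists only a few well-known values.

```python
def encode_community(high, low):
    """Pack the 'high:low' notation into the 32-bit value carried on the wire."""
    if not (0 <= high <= 0xFFFF and 0 <= low <= 0xFFFF):
        raise ValueError("each half of a standard community is 16 bits")
    return (high << 16) | low


def decode_community(value):
    """Render a 32-bit community value back into 'high:low' notation."""
    return f"{value >> 16}:{value & 0xFFFF}"


# A few well-known values (RFC 1997 and RFC 7999).
WELL_KNOWN = {
    0xFFFFFF01: "NO_EXPORT",
    0xFFFFFF02: "NO_ADVERTISE",
    0xFFFF029A: "BLACKHOLE",
}


def describe(value):
    """Use the well-known name when there is one, else the numeric form."""
    return WELL_KNOWN.get(value, decode_community(value))


print(hex(encode_community(65535, 666)))    # 0xffff029a
print(describe(encode_community(65535, 666)))  # BLACKHOLE
print(decode_community(0xFFFFFF01))         # 65535:65281
```

Anything outside the well-known range, such as the ASN:70 steering example, means only what the receiving operator has documented it to mean.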

The Session Lifecycle

Before routes can be exchanged, routers must establish a session by progressing through several states in the BGP Finite State Machine (FSM). This sequence ensures that both peers are ready and authorized to communicate over TCP Port 179.

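  1. Idle: Starting state; no connection is attempted yet
  2. Connect: Waiting for the TCP connection to complete
  3. Active: The TCP attempt failed; retrying or listening for the peer
  4. OpenSent: TCP is up and an OPEN message has been sent
  5. OpenConfirm: OPEN received and KEEPALIVE sent; waiting for the peer's KEEPALIVE
  6. Established: Session up; routes can be exchanged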
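
The state sequence can be sketched as a small transition table. This is a heavily simplified model of the FSM in RFC 4271: the event names are hypothetical labels, timers are reduced to a single retry event, and any event not listed is treated as an error that returns the session to Idle. Following the RFC, Active is modeled as the fallback state entered when the TCP attempt from Connect fails.

```python
# (current state, event) -> next state. Event names are illustrative only.
TRANSITIONS = {
    ("Idle", "start"): "Connect",
    ("Connect", "tcp_established"): "OpenSent",   # OPEN is sent on this edge
    ("Connect", "tcp_failed"): "Active",
    ("Active", "tcp_established"): "OpenSent",
    ("Active", "connect_retry_expired"): "Connect",
    ("OpenSent", "open_received"): "OpenConfirm",  # KEEPALIVE is sent here
    ("OpenConfirm", "keepalive_received"): "Established",
}


def next_state(state, event):
    """Unlisted events are treated as errors and drop back to Idle."""
    return TRANSITIONS.get((state, event), "Idle")


def run(events, state="Idle"):
    """Feed a sequence of events through the FSM and return the final state."""
    for event in events:
        state = next_state(state, event)
    return state


# The happy path skips Active entirely.
print(run(["start", "tcp_established", "open_received", "keepalive_received"]))
# A failed first TCP attempt detours through Active before retrying.
print(run(["start", "tcp_failed", "connect_retry_expired", "tcp_established"]))
```

A session that sits in Active or flaps between Connect and Active usually points to a TCP-level problem, such as a filter on port 179 or a misconfigured neighbor address.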

Handshake & Authentication

Security is a major concern in BGP. Since the protocol relies on a persistent TCP connection, it is vulnerable to session resets and spoofing. To mitigate this, BGP sessions are commonly secured with MD5 Signatures (RFC 2385) or the more modern TCP Authentication Option (TCP-AO, RFC 5925), though many sessions still run without authentication, particularly at internet exchange points (IXPs).

TTL Security (GTSM)

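The Generalized TTL Security Mechanism (RFC 5082) protects sessions using the IP Time-to-Live (TTL) field. Each peer sends its BGP packets with a TTL of 255 and accepts only packets that arrive with a TTL of 255 (or slightly lower when a small hop count is allowed). Since routers decrement TTL at every hop and 255 is the maximum value, a packet from a distant network cannot arrive with a high enough TTL, which prevents attacks from remote networks.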
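
The receive-side check can be sketched in a few lines. The function and its intervening_routers parameter are this sketch's own convention (the number of routers between the peers that decrement TTL); vendor configuration knobs count hops differently, so treat the parameter as an assumption rather than a mapping to any specific CLI.

```python
MAX_TTL = 255  # the largest value the 8-bit IP TTL field can hold


def gtsm_accepts(received_ttl, intervening_routers=0):
    """Accept a packet only if it cannot have crossed more routers than allowed.

    The sender is assumed to transmit with TTL 255, and every router on the
    path decrements it by one.
    """
    if not 0 <= intervening_routers < MAX_TTL:
        raise ValueError("intervening_routers must be between 0 and 254")
    minimum_ttl = MAX_TTL - intervening_routers
    return minimum_ttl <= received_ttl <= MAX_TTL


print(gtsm_accepts(255))      # True: directly connected peer
print(gtsm_accepts(254))      # False: crossed a router, but none are allowed
print(gtsm_accepts(254, 1))   # True: one intervening router is allowed
print(gtsm_accepts(64))       # False: typical TTL of a packet from far away
```

The check works because an attacker cannot start above 255: any packet that has crossed more routers than allowed necessarily arrives below the threshold.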

BGP Message Types

  • Open: Identifies the sender and negotiates session parameters (Hold Time, ASN).
  • Update: The core of BGP. Advertises new reachability or withdraws old routes.
  • Keepalive: Periodically exchanged to ensure the peer is still reachable.
  • Notification: Sent when an error is detected. Immediately closes the session.

Anatomy of BGP Messages

Open Message Details

Attribute        Value
TYPE             OPEN
VERSION          4
MY ASN           10122
HOLD TIME        90
BGP IDENTIFIER   10.255.255.36

Details

The first message sent after the TCP handshake completes. It establishes the 'ground rules' for the peering session, including optional capabilities like IPv6 support or Route Refresh.
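
To make the fields above concrete, the following sketch packs and parses an OPEN message using the wire layout from RFC 4271: a 19-byte header (16-byte marker of all ones, 2-byte length, 1-byte type) followed by version, ASN, hold time, BGP identifier, and the optional-parameters length. The function names are hypothetical, and the sketch deliberately omits optional parameters, so it carries no capabilities and only handles 2-byte ASNs.

```python
import ipaddress
import struct

MARKER = b"\xff" * 16   # header marker: 16 bytes of all ones
HEADER_LENGTH = 19      # marker (16) + length (2) + type (1)
TYPE_OPEN = 1


def build_open(my_asn, hold_time, bgp_identifier, version=4):
    """Build an OPEN message with no optional parameters."""
    identifier = ipaddress.IPv4Address(bgp_identifier).packed
    body = struct.pack("!BHH4sB", version, my_asn, hold_time, identifier, 0)
    header = MARKER + struct.pack("!HB", HEADER_LENGTH + len(body), TYPE_OPEN)
    return header + body


def parse_open(message):
    """Parse the fixed part of an OPEN message into a dict."""
    marker, length, msg_type = struct.unpack("!16sHB", message[:HEADER_LENGTH])
    if marker != MARKER or msg_type != TYPE_OPEN or length != len(message):
        raise ValueError("not a well-formed OPEN message")
    version, my_asn, hold_time, identifier, _opt_len = struct.unpack(
        "!BHH4sB", message[HEADER_LENGTH:HEADER_LENGTH + 10]
    )
    return {
        "version": version,
        "my_asn": my_asn,
        "hold_time": hold_time,
        "bgp_identifier": str(ipaddress.IPv4Address(identifier)),
    }


message = build_open(10122, 90, "10.255.255.36")
print(len(message))        # 29
print(parse_open(message))
```

In practice the optional parameters section is where capabilities such as multiprotocol (IPv6) support and Route Refresh are advertised, so a real OPEN is usually longer than the 29-byte minimum shown here.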

BGP in Action

To truly understand BGP, you must see it in motion. Use the simulation below to walk through the lifecycle of a route, from its initial announcement and path selection across the global internet mesh to handling failures and anycast failover.

1. Announcing

The Origin AS 'announces' its IP space. Routers propagate this information so that every network knows the path back to the origin.
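
The propagation described above can be sketched as a flood over a graph of ASes, where each AS prepends its own number before passing the route on and ignores any announcement that already contains its number (BGP's loop prevention). The topology and function name are hypothetical, and the sketch ignores policy entirely: every AS simply keeps the first, and therefore shortest, path it hears.

```python
from collections import deque

# Hypothetical topology: AS number -> directly connected neighbor ASes.
LINKS = {
    64500: [64501, 64502],
    64501: [64500, 64503],
    64502: [64500, 64503],
    64503: [64501, 64502],
}


def propagate(origin_as, links):
    """Return {AS: AS path back to the origin} after the announcement floods."""
    best = {origin_as: ()}        # the origin reaches itself with an empty path
    queue = deque([origin_as])
    while queue:
        sender = queue.popleft()
        advertised = (sender,) + best[sender]  # sender prepends its own ASN
        for neighbor in links[sender]:
            if neighbor in advertised:
                continue          # loop prevention: own ASN already in path
            if neighbor not in best:
                best[neighbor] = advertised    # first path heard is shortest
                queue.append(neighbor)
    return best


paths = propagate(64500, LINKS)
print(paths[64501])  # (64500,)
print(paths[64503])  # (64501, 64500)
```

Reading a path from left to right gives the chain of networks traffic would cross on its way back to the origin, which is listed last.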

[Diagram: BGP route announcement between the Origin AS and the User]

References

Advanced BGP Topics

Explore the complex protocols and architectural standards built on top of BGP's extensible framework.


Path & Scalability

  • eBGP vs iBGP

    External BGP is used between networks while Internal BGP distributes those routes within a single AS.

  • Route Reflection (RFC 4456)

    A method to scale internal networks by reducing the need for every router to talk to every other router.

  • BGP ADD-PATH (RFC 7911)

    Allows advertising multiple paths for the same prefix to enable better ECMP and faster convergence.

  • BGP PIC

    Prefix Independent Convergence allows millisecond failover by using pre-calculated backup paths.

  • Confederations (RFC 5065)

    Dividing a large AS into smaller sub-ASs to simplify management and reduce peering overhead.
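
The scaling problem that Route Reflection addresses is easy to quantify: a full iBGP mesh of n routers needs n(n-1)/2 sessions. The sketch below compares that with a reflector design, assuming every client peers with every reflector and the reflectors are fully meshed among themselves; the function names are hypothetical.

```python
def full_mesh_sessions(routers):
    """Every router peers with every other router: n * (n - 1) / 2."""
    return routers * (routers - 1) // 2


def route_reflector_sessions(routers, reflectors=1):
    """Clients peer only with the reflectors; reflectors mesh with each other."""
    if not 1 <= reflectors <= routers:
        raise ValueError("reflectors must be between 1 and the router count")
    clients = routers - reflectors
    return clients * reflectors + full_mesh_sessions(reflectors)


print(full_mesh_sessions(100))           # 4950
print(route_reflector_sessions(100, 2))  # 197
```

Full-mesh growth is quadratic while the reflector design grows roughly linearly with the number of routers, which is why large networks rarely run a full mesh.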

Security & Integrity

  • BGPsec (RFC 8205)

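    Full path signing. Rarely deployed, in part due to its cryptographic processing cost; RPKI origin validation is far more widely deployed, though it verifies only the origin AS rather than the full path.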

  • BGP OPSEC (RFC 7454)

    Best practices for securing BGP sessions including TTL security and prefix filtering.

  • RPKI Validation

    Cryptographic verification that an AS is authorized to originate specific IP prefixes.
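
Once the cryptographic checks have produced a list of authorizations (ROAs), RPKI origin validation compares each announcement against them and yields one of three states, as described in RFC 6811. The sketch below covers only that comparison step, not the certificate validation itself; the ROA class, function name, and sample data are hypothetical.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class ROA:
    """Hypothetical, already-validated authorization record."""
    prefix: str
    max_length: int
    asn: int


def validate_origin(prefix, origin_asn, roas):
    """Return 'valid', 'invalid', or 'not-found' for an announcement."""
    announced = ipaddress.ip_network(prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        # A ROA is relevant only if its prefix covers the announced prefix.
        if announced.version != roa_net.version or not announced.subnet_of(roa_net):
            continue
        covered = True
        if roa.asn == origin_asn and announced.prefixlen <= roa.max_length:
            return "valid"
    # Covered by some ROA but matched by none: the announcement is invalid.
    return "invalid" if covered else "not-found"


ROAS = [ROA("198.51.100.0/24", 24, 64500)]

print(validate_origin("198.51.100.0/24", 64500, ROAS))  # valid
print(validate_origin("198.51.100.0/24", 64666, ROAS))  # invalid
print(validate_origin("198.51.100.0/25", 64500, ROAS))  # invalid
print(validate_origin("192.0.2.0/24", 64500, ROAS))     # not-found
```

The third case shows why the maximum length matters: a more specific announcement from the right AS is still invalid if the ROA does not permit that prefix length.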

Traffic Engineering & Resiliency

Modern Overlays

Network Tooling & Resources

A Looking Glass allows engineers to view the routing table from the perspective of a specific remote router.

  • Cloudflare Radar

    Real-time insights into internet traffic, security, and routing patterns globally.

  • PeeringDB

    The industry-standard database for peering locations and network interconnection data.

  • HE BGP Toolkit

    Extensive BGP routing information, including AS details, prefix propagation, and path history.

  • RouteViews

    A global project providing real-time BGP data to researchers since 1995 via dozens of collectors.

  • RIPE NCC RIS

    The Routing Information Service collects and stores BGP routing updates from over 600 peer sessions.