IPv6

Configuring IPv6 statically without NAT

I am trying to configure IPv6 for my home network. With IPv4, I have stable internal addresses because I use a block from RFC 1918 (192.168/16, 172.16/12, 10/8) and then NAT everything to the internet. With IPv6, I get a prefix delegation but the prefix changes occasionally, so I can't configure static v6 addresses for internal services. What do I do?

Use a ULA prefix in addition to your provider-delegated address block(s)!

IPv6 was designed from the very beginning to accommodate multi-homing: this means that each node in the network can have multiple addresses, which it will then choose from (somewhat intelligently) for any given connection.

First, generate a random /48 ULA prefix using the following snippet:

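# Requires bash (for ${var:offset:length}) and xxd.
# Read 5 random bytes (40 bits) as 10 hex digits for the ULA Global ID.
R=$(xxd -p -l 5 /dev/urandom)
printf "fd%s:%s%s:%s%s::/48\n" "${R:0:2}" "${R:2:2}" "${R:4:2}" "${R:6:2}" "${R:8:2}"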

Always generate a random ULA prefix. Do not use my example prefix below!
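If you would rather not depend on bash and xxd, the same idea can be sketched in Python. This is only an illustration of the layout (fd, then a 40-bit random Global ID, then zeros), not the exact algorithm suggested in RFC 4193; the function name random_ula_prefix is my own invention.

```python
import ipaddress
import secrets


def random_ula_prefix() -> ipaddress.IPv6Network:
    """Return a random /48 inside fd00::/8 (hypothetical helper)."""
    global_id = secrets.randbits(40)                 # 40-bit random Global ID
    # Top 8 bits are 0xfd, the next 40 bits are the Global ID,
    # and the remaining 80 bits (subnet ID + interface ID) stay zero.
    prefix_int = (0xFD << 120) | (global_id << 80)
    return ipaddress.IPv6Network((prefix_int, 48))


print(random_ula_prefix())
```

Each run prints a different prefix, which is the point: pick one, write it down, and keep it.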

For every distinct internal network (e.g., VLAN), add a /64 prefix from within that /48 to radvd.conf alongside the PD prefix. Assign an address from that /64 to the corresponding router interface, and configure it in that radvd.conf stanza as the router for that prefix. (Since this prefix is permanent, I also recommend setting the lifetimes for those addresses to infinity once you are sure everything is working.) For example:

interface eth1.10 {
    .
    .
    .
    prefix fdea:dbee:fdea:1::1/64 {
        AdvValidLifetime infinity;
        AdvPreferredLifetime infinity;
        AdvOnLink on;
        AdvAutonomous on;
        AdvRouterAddr off;
        DecrementLifetimes off;
        DeprecatePrefix off;
    };

You can also provide an explicit route for the entire block, which will allow ULA routing to continue working even when the provider has an extended outage:

    route fdea:dbee:fdea::/48 {
        AdvRouteLifetime infinity;
        RemoveRoute off;
    };

(If you aren't using radvd, systemd almost certainly has some equivalent here.) The end result is that each of your nodes will autogenerate a SLAAC address in both prefixes in response to the next router advertisement.
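To make "a SLAAC address in both prefixes" concrete, here is a rough Python sketch of the classic modified EUI-64 derivation applied to two /64s. The MAC address and the 2001:db8:... provider prefix are placeholders I made up for illustration, and many modern hosts use stable-privacy or temporary interface identifiers instead, so treat this as one possible outcome rather than what your nodes will necessarily do.

```python
import ipaddress


def eui64_address(prefix: str, mac: str) -> ipaddress.IPv6Address:
    """Combine a /64 prefix with a modified EUI-64 interface ID (hypothetical helper)."""
    net = ipaddress.IPv6Network(prefix)
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    octets[0] ^= 0x02                                   # flip the universal/local bit
    iid = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])  # insert ff:fe in the middle
    return net[int.from_bytes(iid, "big")]


mac = "00:11:22:33:44:55"                               # placeholder MAC
print(eui64_address("fdea:dbee:fdea:1::/64", mac))      # ULA prefix from the example above
print(eui64_address("2001:db8:1234:1::/64", mac))       # stand-in for a provider-delegated /64
```

The interface identifier is the same in both addresses; only the prefix differs, which is why the ULA address stays put when the provider prefix changes.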

Now you have static internal prefixes and can assign static addresses within those prefixes to devices that really want static addresses rather than EUI-64 or random/privacy SLAAC addresses, which you can then configure as permanent in DNS. All without having to deal with dynamic DNS for provider-delegated prefixes. In accordance with RFC 6724 (and its soon-to-be-successor), internal connections will use the ULA source addresses, while connections to external systems will use the provider's source addresses. Everyone wins!
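If it helps with planning, the carving of /64s and static addresses out of the /48 can be sketched with Python's ipaddress module. The helper names and the choice of ::53 for a DNS server are my own assumptions, and the prefix is the example one from this page, which you should replace with your own.

```python
import ipaddress

# Example prefix from this page; substitute your own randomly generated /48.
ULA = ipaddress.IPv6Network("fdea:dbee:fdea::/48")


def subnet_for(subnet_id: int) -> ipaddress.IPv6Network:
    """Return the /64 with the given 16-bit subnet ID (hypothetical helper)."""
    if not 0 <= subnet_id <= 0xFFFF:
        raise ValueError("subnet ID must fit in 16 bits")
    base = int(ULA.network_address)
    return ipaddress.IPv6Network((base + (subnet_id << 64), 64))


def static_host(subnet_id: int, host_id: int) -> ipaddress.IPv6Address:
    """Return a fixed address inside one of the /64s (hypothetical helper)."""
    return subnet_for(subnet_id)[host_id]


print(subnet_for(1))           # fdea:dbee:fdea:1::/64
print(static_host(1, 0x53))    # fdea:dbee:fdea:1::53
```

Addresses produced this way never change with the provider prefix, so they can go straight into static AAAA records.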

What's a good regular expression for matching IPv6 addresses?

The top answer on Stack Overflow probably works fine, but it's hard to understand and includes some addresses I actually don't want to match, like IPv4-embedded addresses and %zones. If you just want to validate vanilla compressed IPv6 addresses, this PCRE, presented as a Python string for readability, is probably correct:

ipv6_valid_re = \
    r'^(?!.*?(::.*?::|:::))' + \
    r'(?!:[^:])' + \
    r'(?!.*[^:]:$)' + \
    r'(?=(:?[^:]*?(:[^:]*?){7}:?|' + \
    r'(?=.*?::)[^:]*?(:[^:]*?){2,6})$)' + \
    r'([0-9A-Fa-f]{0,4}::?){0,7}([0-9A-Fa-f]{0,4})$'

Something this short is possible primarily through extensive use of PCRE's look-ahead matching, which allows it to match the same text multiple times. (You can of course break this into multiple regular expression matches to achieve the same effect, but it's nice to have one giant, ugly RE, amirite?) Breaking this down into functional components:

(1) ^(?!.*?(::.*?::|:::)) fails whenever there are two double colons or a triple colon.
(2) ^…(?!:[^:]) and ^…(?!.*[^:]:$) fail on a leading or trailing single colon, respectively.
(3) ^…(?=(:?[^:]*?(:[^:]*?){7}:?|…)$) accepts any string with seven colons, plus an extra leading and/or trailing colon; but alternatively:
(4) ^…(?=(…|(?=.*?::)[^:]*?(:[^:]*?){2,6})$) accepts any string with between two and six colons, as long as it also includes a double colon.
(5) ^…([0-9A-Fa-f]{0,4}::?){0,7}([0-9A-Fa-f]{0,4})$ matches any combination of up to eight (possibly double-)colon-separated groups of between zero and four hex characters each.

The logic here is:

(1) refuses to match any attempt to compress more than one sequence of groups, or any obviously malformed attempt at compression.
(2) refuses to match an empty group at the beginning or end.
The disjunction of (3) and (4) requires that there be at least 2 colons (the most compressed addresses having a sequence of 6, 7, or 8 zero-valued groups).
(3) and (4) also require that there be at most 7 colons, plus a possible extra leading or trailing colon for the degenerate compression of 0 into :. (Having both is prohibited by (1).)
(5) otherwise matches any sequence of empty or valid groups separated by one or two colons, with all the malformed variants having been rejected earlier.
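The rules above can be exercised with a handful of cases. This is a minimal sketch using Python's re module (which supports the look-aheads used here); the wrapper name is_valid_ipv6 and the sample lists are mine, not an exhaustive test suite.

```python
import re

# The validation pattern from above, compiled with Python's re module.
IPV6_VALID_RE = re.compile(
    r'^(?!.*?(::.*?::|:::))'
    r'(?!:[^:])'
    r'(?!.*[^:]:$)'
    r'(?=(:?[^:]*?(:[^:]*?){7}:?|'
    r'(?=.*?::)[^:]*?(:[^:]*?){2,6})$)'
    r'([0-9A-Fa-f]{0,4}::?){0,7}([0-9A-Fa-f]{0,4})$'
)


def is_valid_ipv6(text: str) -> bool:
    """Return True if text is a plain (possibly compressed) IPv6 address."""
    return IPV6_VALID_RE.match(text) is not None


should_match = ["::", "::1", "1::", "2001:db8::1", "1:2:3:4:5:6:7:8"]
should_fail = [
    "",                    # nothing at all
    "1:2:3:4:5:6:7",       # seven groups, no compression
    "1:2:3:4:5:6:7:8:9",   # nine groups
    "1::2::3",             # two compressions
    ":1:2:3:4:5:6:7",      # leading single colon
    "12345::1",            # group longer than four hex digits
    "fe80::1%eth0",        # zone suffix, deliberately rejected
    "::ffff:192.0.2.1",    # IPv4-embedded form, deliberately rejected
]

for addr in should_match:
    print(addr, is_valid_ipv6(addr))
for addr in should_fail:
    print(repr(addr), is_valid_ipv6(addr))
```

One Python-specific caveat: $ also matches just before a trailing newline, so strip the input first if it may end in one.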

I've done extensive testing against some lists of edge-case valid addresses and invalid-but-close strings, but it's possible there's a bug. If anyone finds a case for which this doesn't work, let me know. The one big downside of this PCRE is that it relies on the boundary conditions ^ and $ for correctness, so you can use it as-is only to validate complete addresses, not to search for addresses. To that end...

But I need to search for IPv6 addresses!

Below is a search-capable version of the above PCRE, presented as a Python string for readability:

ipv6_search_re = \
    r'(?<![:.\w])' + \
    r'(?![:0-9A-Fa-f]*?(::[:0-9A-Fa-f]*?::|:::))' + \
    r'(?!:[0-9A-Fa-f])' + \
    r'(?![:0-9A-Fa-f]*[0-9A-Fa-f]:(?![:.\w]))' + \
    r'(?=(:?[0-9A-Fa-f]*?(:[0-9A-Fa-f]*?){7}:?|' + \
    r'(?=[:0-9A-Fa-f]*?::)[0-9A-Fa-f]*?(:[0-9A-Fa-f]*?){2,6})(?![:.\w]))' + \
    r'([0-9A-Fa-f]{0,4}::?){0,7}([0-9A-Fa-f]{0,4})(?![:.\w])'

The main changes from the validation-only version are:

- Use negative look-behinds and look-aheads instead of ^ and $ as boundary conditions, to prevent matching a substring of an otherwise-invalid address.
  - I chose the character class [:.\w] (where \w is all "word" characters) to represent characters that should be considered part of the address for purposes of a search.
  - Note that this class includes . so that a v6-embedded v4 address is treated as a single candidate, even though it will not currently validate; this avoids incorrectly finding just the v6 prefix of such an address. I did not do the same with % because it seems to me that an IPv6 address is 128 bits, period, and so cannot encode the zone.
- When bypassing content irrelevant to a particular test, match only characters valid in IPv6 addresses ([:0-9A-Fa-f]*) rather than any character (.*).
- When bypassing the contents of a group or requiring a non-empty group, match on valid hex characters rather than on [^:].
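As a quick illustration of those boundary rules, here is a small Python sketch that runs the search pattern over a made-up log line. The log line and the helper name find_ipv6 are invented for the example; note the use of finditer rather than findall, since the pattern contains capture groups.

```python
import re

# The search pattern from above, compiled with Python's re module.
IPV6_SEARCH_RE = re.compile(
    r'(?<![:.\w])'
    r'(?![:0-9A-Fa-f]*?(::[:0-9A-Fa-f]*?::|:::))'
    r'(?!:[0-9A-Fa-f])'
    r'(?![:0-9A-Fa-f]*[0-9A-Fa-f]:(?![:.\w]))'
    r'(?=(:?[0-9A-Fa-f]*?(:[0-9A-Fa-f]*?){7}:?|'
    r'(?=[:0-9A-Fa-f]*?::)[0-9A-Fa-f]*?(:[0-9A-Fa-f]*?){2,6})(?![:.\w]))'
    r'([0-9A-Fa-f]{0,4}::?){0,7}([0-9A-Fa-f]{0,4})(?![:.\w])'
)


def find_ipv6(text: str) -> list[str]:
    """Return every IPv6 address the search pattern finds in text."""
    return [m.group(0) for m in IPV6_SEARCH_RE.finditer(text)]


# Invented log line: a timestamp, two addresses, a MAC, and an IPv4-embedded form.
line = ("12:34:56 sshd: from 2001:db8::1 via fe80::1%eth0 "
        "mac 00:11:22:33:44:55 mapped ::ffff:192.0.2.1")
print(find_ipv6(line))
```

Here the timestamp and the MAC address are skipped, the zone suffix is left off the link-local address, and the IPv4-embedded form is not reported at all, which matches the choices described above.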

In my very limited testing, this PCRE is more than 4x as fast as the top answer on Stack Overflow: searching for matches in a 170MB syslogemu.log on my 9950X3D took 26.32 seconds with the Stack Overflow answer's regex, while this PCRE performed the same task in only 6.01 seconds. That answer's regex also missed some addresses, and "found" some MAC address pair substrings (8 groups of the 12 hex pairs in a pair of MAC addresses) as well as partial addresses ending in a double colon that do not actually exist as complete addresses anywhere in that file.