<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE rfc [
  <!ENTITY nbsp    "&#160;">
  <!ENTITY zwsp   "&#8203;">
  <!ENTITY nbhy   "&#8209;">
  <!ENTITY wj     "&#8288;">
]>
<?xml-stylesheet type="text/xsl" href="rfc2629.xslt" ?>
<!-- generated by https://github.com/cabo/kramdown-rfc version 1.6.22 (Ruby 3.2.0) -->
<rfc xmlns:xi="http://www.w3.org/2001/XInclude" ipr="trust200902" docName="draft-ietf-jsonpath-iregexp-03" category="std" consensus="true" submissionType="IETF" tocInclude="true" sortRefs="true" symRefs="true" version="3">
  <!-- xml2rfc v2v3 conversion 3.16.0 -->
  <front>
    <title abbrev="I-Regexp">I-Regexp: An Interoperable Regexp Format</title>
    <seriesInfo name="Internet-Draft" value="draft-ietf-jsonpath-iregexp-03"/>
    <author initials="C." surname="Bormann" fullname="Carsten Bormann">
      <organization>Universität Bremen TZI</organization>
      <address>
        <postal>
          <street>Postfach 330440</street>
          <city>Bremen</city>
          <code>D-28359</code>
          <country>Germany</country>
        </postal>
        <phone>+49-421-218-63921</phone>
        <email>cabo@tzi.org</email>
      </address>
    </author>
    <author initials="T." surname="Bray" fullname="Tim Bray">
      <organization>Textuality</organization>
      <address>
        <postal>
          <country>Canada</country>
        </postal>
        <email>tbray@textuality.com</email>
      </address>
    </author>
    <date year="2023" month="February" day="04"/>
    <keyword>Internet-Draft</keyword>
    <abstract>
      <t>This document specifies I-Regexp, a flavor of regular expressions that is
limited in scope with the goal of interoperation across many different
regular-expression libraries.</t>
    </abstract>
    <note removeInRFC="true">
      <name>About This Document</name>
      <t>
        Status information for this document may be found at <eref target="https://datatracker.ietf.org/doc/draft-ietf-jsonpath-iregexp/"/>.
      </t>
      <t>
        Discussion of this document takes place on the
        JSONPath Working Group mailing list (<eref target="mailto:JSONPath@ietf.org"/>),
        which is archived at <eref target="https://mailarchive.ietf.org/arch/browse/JSONPath/"/>.
        Subscribe at <eref target="https://www.ietf.org/mailman/listinfo/JSONPath/"/>.
      </t>
      <t>Source for this draft and an issue tracker can be found at
        <eref target="https://github.com/ietf-wg-jsonpath/iregexp"/>.</t>
    </note>
  </front>
  <middle>
    <section anchor="intro">
      <name>Introduction</name>
      <t>This specification describes an interoperable regular expression flavor, I-Regexp.</t>
      <t>I-Regexp does not provide advanced regular expression features such as capture groups, lookahead, or backreferences.
It supports only a Boolean matching capability, i.e., testing whether a given regular expression matches a given piece of text.</t>
      <t>I-Regexp supports the entire repertoire of Unicode characters.</t>
      <t>I-Regexp is a subset of XSD regular expressions <xref target="XSD-2"/>.</t>
      <t>This document includes guidance for converting I-Regexps for use with several well-known regular expression idioms.</t>
      <section anchor="terminology">
        <name>Terminology</name>
        <t>This document uses the abbreviation "regexp" for what are usually
called regular expressions in programming.
"I-Regexp" is used as a noun meaning a character string which conforms to the requirements
in this specification; the plural is "I-Regexps".</t>
        <t>The key words "<bcp14>MUST</bcp14>", "<bcp14>MUST NOT</bcp14>", "<bcp14>REQUIRED</bcp14>", "<bcp14>SHALL</bcp14>", "<bcp14>SHALL
NOT</bcp14>", "<bcp14>SHOULD</bcp14>", "<bcp14>SHOULD NOT</bcp14>", "<bcp14>RECOMMENDED</bcp14>", "<bcp14>NOT RECOMMENDED</bcp14>",
"<bcp14>MAY</bcp14>", and "<bcp14>OPTIONAL</bcp14>" in this document are to be interpreted as
described in BCP 14 <xref target="RFC2119"/> <xref target="RFC8174"/> when, and only when, they
appear in all capitals, as shown here.</t>
        <t>The grammatical rules in this document are to be interpreted as ABNF,
as described in <xref target="RFC5234"/> and <xref target="RFC7405"/>.</t>
      </section>
    </section>
    <section anchor="requirements">
      <name>Requirements</name>
      <t>I-Regexps should handle the vast majority of practical cases where a
matching regexp is needed in a data model specification or a query
language expression.</t>
      <t>The editors of this document conducted a survey of the regexp syntax
used in published RFCs. All examples found there should be covered by I-Regexps,
both syntactically and with their intended semantics.
The exception is the use of multi-character escapes, for which
workaround guidance is provided in <xref target="mapping"/>.</t>
    </section>
    <section anchor="defn">
      <name>I-Regexp Syntax</name>
      <t>An I-Regexp <bcp14>MUST</bcp14> conform to the ABNF specification in
<xref target="iregexp-abnf"/>.</t>
      <figure anchor="iregexp-abnf">
        <name>I-Regexp Syntax in ABNF</name>
        <sourcecode type="abnf"><![CDATA[
i-regexp = branch *( "|" branch )
branch = *piece
piece = atom [ quantifier ]
quantifier = ( %x2A-2B ; '*'-'+'
 / "?" ) / ( "{" quantity "}" )
quantity = QuantExact [ "," [ QuantExact ] ]
QuantExact = 1*%x30-39 ; '0'-'9'

atom = NormalChar / charClass / ( "(" i-regexp ")" )
NormalChar = ( %x00-27 / %x2C-2D ; ','-'-'
 / %x2F-3E ; '/'-'>'
 / %x40-5A ; '@'-'Z'
 / %x5E-7A ; '^'-'z'
 / %x7E-10FFFF )
charClass = "." / SingleCharEsc / charClassEsc / charClassExpr
SingleCharEsc = "\" ( %x28-2B ; '('-'+'
 / %x2D-2E ; '-'-'.'
 / "?" / %x5B-5E ; '['-'^'
 / %s"n" / %s"r" / %s"t" / %x7B-7D ; '{'-'}'
 )
charClassEsc = catEsc / complEsc
charClassExpr = "[" [ "^" ] ( "-" / CCE1 ) *CCE1 [ "-" ] "]"
CCE1 = ( CCchar [ "-" CCchar ] ) / charClassEsc
CCchar = ( %x00-2C / %x2E-5A ; '.'-'Z'
 / %x5E-10FFFF ) / SingleCharEsc
catEsc = %s"\p{" charProp "}"
complEsc = %s"\P{" charProp "}"
charProp = IsCategory
IsCategory = Letters / Marks / Numbers / Punctuation / Separators /
    Symbols / Others
Letters = %s"L" [ ( %x6C-6D ; 'l'-'m'
 / %s"o" / %x74-75 ; 't'-'u'
 ) ]
Marks = %s"M" [ ( %s"c" / %s"e" / %s"n" ) ]
Numbers = %s"N" [ ( %s"d" / %s"l" / %s"o" ) ]
Punctuation = %s"P" [ ( %x63-66 ; 'c'-'f'
 / %s"i" / %s"o" / %s"s" ) ]
Separators = %s"Z" [ ( %s"l" / %s"p" / %s"s" ) ]
Symbols = %s"S" [ ( %s"c" / %s"k" / %s"m" / %s"o" ) ]
Others = %s"C" [ ( %s"c" / %s"f" / %x6E-6F ; 'n'-'o'
 ) ]
]]></sourcecode>
      </figure>
      <t>As an additional restriction, <tt>charClassExpr</tt> is not allowed to
match <tt>[^]</tt>, which according to this grammar would parse as a
positive character class containing the single character <tt>^</tt>.</t>
      <t>This is essentially XSD regexp without character class
subtraction, without multi-character escapes such as <tt>\s</tt>,
<tt>\S</tt>, and <tt>\w</tt>, and without Unicode blocks.</t>
      <t>An I-Regexp implementation <bcp14>MUST</bcp14> be a complete implementation of this
limited subset.
In particular, full Unicode support is <bcp14>REQUIRED</bcp14>; the implementation
<bcp14>MUST NOT</bcp14> limit itself to 7- or 8-bit character sets such as ASCII and
<bcp14>MUST</bcp14> support the Unicode character property set in character classes.</t>
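      <t>As a non-normative illustration, the following minimal Python sketch
shows one way the category escapes (<tt>\p{...}</tt> and <tt>\P{...}</tt>)
could be evaluated against Unicode tables. The function name
<tt>in_category</tt> and its parameters are hypothetical and not part of
this specification; the sketch assumes that the standard
<tt>unicodedata</tt> module, whose Unicode version depends on the Python
release, is an acceptable source of general categories.</t>
      <sourcecode type="python"><![CDATA[
```python
import unicodedata


def in_category(char, prop, complement=False):
    """Evaluate \\p{prop} (or \\P{prop} if complement is set) for char.

    prop is a one-letter major class ("L", "N", ...) or a two-letter
    general category ("Lu", "Nd", ...), as in the IsCategory production.
    The name and signature are illustrative assumptions.
    """
    # unicodedata.category() returns the two-letter general category,
    # e.g. "Nd" for decimal digits; a one-letter prop matches by prefix.
    matched = unicodedata.category(char).startswith(prop)
    return matched != complement
```
]]></sourcecode>
      <t>For instance, <tt>in_category("\u0663", "Nd")</tt> holds for ARABIC-INDIC
DIGIT THREE, which a 7-bit or 8-bit implementation would not recognize as a digit.</t>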
      <section anchor="checking">
        <name>Checking Implementations</name>
        <t>A <em>checking</em> I-Regexp implementation is one that checks a supplied
regexp for compliance with this specification and reports any problems.
Checking implementations give their users confidence that they did not
accidentally insert non-interoperable syntax, so checking is <bcp14>RECOMMENDED</bcp14>.
Exceptions to this rule may be made for low-effort implementations
that map I-Regexp to another regexp library by simple steps such as
performing the mapping operations discussed in <xref target="mapping"/>; here, the
effort needed to do full checking may dwarf the rest of the
implementation effort.
Implementations <bcp14>SHOULD</bcp14> document whether they are checking or not.</t>
        <t>Specifications that employ I-Regexp may want to define in which
cases their implementations can work with a non-checking I-Regexp
implementation and when full checking is needed, possibly in the
process of defining their own implementation classes.</t>
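        <t>To give a rough idea of the effort involved, the following Python
sketch shows what a small checking step might look like. It is a
hand-written recursive-descent recognizer for the ABNF in
<xref target="iregexp-abnf"/>, including the additional restriction on
<tt>[^]</tt>; it checks syntax only and reports a plain Boolean rather than
detailed problems. All names (<tt>is_iregexp</tt>, <tt>_Checker</tt>, and
the module-level constants) are hypothetical and chosen for illustration.</t>
        <sourcecode type="python"><![CDATA[
```python
import re

# Characters that may follow "\" in the SingleCharEsc production.
_SINGLE_ESC = set("()*+-.?[\\]^nrt{|}")
# Characters excluded from NormalChar and from unescaped CCchar.
_NOT_NORMAL = set("()*+.?[\\]{|}")
_NOT_CCCHAR = set("-[\\]")
# catEsc / complEsc with the category names allowed by IsCategory.
_CAT_ESC = re.compile(
    r"\\[pP]\{(?:L[lmotu]?|M[cen]?|N[dlo]?|P[cdefios]?"
    r"|Z[lps]?|S[ckmo]?|C[cfno]?)\}")
_QUANTIFIER = re.compile(r"[*+?]|\{[0-9]+(?:,[0-9]*)?\}")


class _Checker:
    def __init__(self, text):
        self.s, self.i = text, 0

    def peek(self, offset=0):
        # Returns "" when looking past the end of the input.
        return self.s[self.i + offset:self.i + offset + 1]

    def take(self, pattern):
        m = pattern.match(self.s, self.i)
        if m:
            self.i = m.end()
        return m is not None

    def single_esc(self):
        if self.peek() == "\\" and self.peek(1) in _SINGLE_ESC:
            self.i += 2
            return True
        return False

    def cc_char(self):
        c = self.peek()
        if c and c not in _NOT_CCCHAR:
            self.i += 1
            return True
        return self.single_esc()

    def class_expr(self):
        self.i += 1                      # skip "["
        if self.peek() == "^":
            self.i += 1                  # negation; "[^]" fails below
        first = True
        while first or self.peek() != "]":
            if self.peek() == "-" and (first or self.peek(1) == "]"):
                self.i += 1              # leading or trailing hyphen
            elif self.take(_CAT_ESC):
                pass
            elif self.cc_char():
                mark = self.i
                if self.peek() == "-":   # possible range "x-y"
                    self.i += 1
                    if not self.cc_char():
                        self.i = mark    # hyphen is not part of a range
            else:
                raise ValueError("malformed character class")
            first = False
        self.i += 1                      # skip "]"

    def atom(self):
        c = self.peek()
        if c == "(":
            self.i += 1
            self.regexp()
            if self.peek() != ")":
                raise ValueError("missing ')'")
            self.i += 1
        elif c == "[":
            self.class_expr()
        elif c == "\\":
            if not (self.take(_CAT_ESC) or self.single_esc()):
                raise ValueError("unsupported escape")
        elif c and (c == "." or c not in _NOT_NORMAL):
            self.i += 1
        else:
            return False                 # no atom starts here
        return True

    def regexp(self):
        while True:
            while self.atom():
                self.take(_QUANTIFIER)   # optional quantifier
            if self.peek() != "|":
                return
            self.i += 1


def is_iregexp(text):
    """Return True if text matches the i-regexp production (sketch)."""
    checker = _Checker(text)
    try:
        checker.regexp()
    except ValueError:
        return False
    return checker.i == len(text)
```
]]></sourcecode>
        <t>With this sketch, <tt>is_iregexp</tt> accepts <tt>[0-9]{8}\.[0-9]{6}</tt>
and rejects <tt>\d{4}</tt>, <tt>[^]</tt>, and <tt>\p{IsBasicLatin}</tt>.
A production-quality checker would additionally report where and why a
regexp fails to conform.</t>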
      </section>
    </section>
    <section anchor="i-regexp-semantics">
      <name>I-Regexp Semantics</name>
      <t>This syntax is a subset of that of <xref target="XSD-2"/>.
Implementations which interpret I-Regexps <bcp14>MUST</bcp14>
yield Boolean results as specified in <xref target="XSD-2"/>.
(See also <xref target="xsd-regexps"/>.)</t>
    </section>
    <section anchor="mapping">
      <name>Mapping I-Regexp to Regexp Dialects</name>
      <t>The material in this section is non-normative, provided as guidance
to developers who want to use I-Regexps in the context of other
regular expression dialects.</t>
      <section anchor="multi-character-escapes">
        <name>Multi-Character Escapes</name>
        <t>Common multi-character escapes (MCEs), and character classes built around them,
are not supported in I-Regexp; they can usually
be replaced as shown by the examples in <xref target="tbl-sub"/>.</t>
        <table anchor="tbl-sub">
          <name>Example substitutes for multi-character escapes</name>
          <thead>
            <tr>
              <th align="left">MCE/class</th>
              <th align="left">Replace with</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td align="left">
                <tt>\S</tt></td>
              <td align="left">
                <tt>[^ \t\n\r]</tt></td>
            </tr>
            <tr>
              <td align="left">
                <tt>[\S ]</tt></td>
              <td align="left">
                <tt>[^\t\n\r]</tt></td>
            </tr>
            <tr>
              <td align="left">
                <tt>\d</tt></td>
              <td align="left">
                <tt>[0-9]</tt></td>
            </tr>
          </tbody>
        </table>
        <t>Note that the semantics of <tt>\d</tt> in XSD regular expressions is that of
<tt>\p{Nd}</tt>; however, this would include all Unicode characters that are
digits in various writing systems, which is almost certainly not what is
required in IETF publications.</t>
        <t>The construct <tt>\p{IsBasicLatin}</tt> is essentially a reference to legacy
ASCII; it can be replaced by the character class <tt>[\u0000-\u007f]</tt>.</t>
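        <t>The substitutions in <xref target="tbl-sub"/> could be applied
mechanically along the lines of the following Python sketch. The function
name <tt>replace_mces</tt> and the tokenizer are assumptions made for
illustration. The sketch covers only the three table rows, leaves any other
character class that contains an MCE untouched, and does not handle XSD
character class subtraction.</t>
        <sourcecode type="python"><![CDATA[
```python
import re

# Substitutions taken from the example table; nothing else is rewritten.
_SUBSTITUTES = {
    r"\S": r"[^ \t\n\r]",
    r"[\S ]": r"[^\t\n\r]",
    r"\d": "[0-9]",
}

# Tokenizer: a complete bracket expression, a backslash escape,
# or any other single character. Tokenizing first keeps an escaped
# backslash followed by "d" from being mistaken for the MCE \d.
_TOKEN = re.compile(r"\[(?:\\.|[^\]\\])*\]|\\.|.", re.DOTALL)


def replace_mces(regexp):
    """Apply the example substitutions; other tokens pass through."""
    return "".join(_SUBSTITUTES.get(tok, tok)
                   for tok in _TOKEN.findall(regexp))
```
]]></sourcecode>
        <t>For example, <tt>replace_mces</tt> turns <tt>\d{4}-\d{2}-\d{2}</tt> into
<tt>[0-9]{4}-[0-9]{2}-[0-9]{2}</tt>. As noted above, this narrows the XSD
semantics of <tt>\d</tt> to ASCII digits, so the rewrite is appropriate only
where that narrowing is intended.</t>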
      </section>
      <section anchor="xsd-regexps">
        <name>XSD Regexps</name>
        <t>Any I-Regexp is also an XSD Regexp <xref target="XSD-2"/>, so the mapping is an identity
function.</t>
        <t>Note that a few errata for <xref target="XSD-2"/> have been fixed in <xref target="XSD11-2"/>, which
is therefore also included as a normative reference.
XSD 1.1 is less widely implemented than XSD 1.0, and implementations
of XSD 1.0 are likely to include these bugfixes, so for the intents
and purposes of this specification an implementation of XSD 1.0
regexps is equivalent to an implementation of XSD 1.1 regexps.</t>
      </section>
      <section anchor="toESreg">
        <name>ECMAScript Regexps</name>
        <t>Perform the following steps on an I-Regexp to obtain an ECMAScript
regexp <xref target="ECMA-262"/>:</t>
        <ul spacing="normal">
          <li>For any unescaped dots (<tt>.</tt>) outside character classes (first alternative
of the <tt>charClass</tt> production), replace the dot by <tt>[^\n\r]</tt>.</li>
          <li>Enclose the result in <tt>^(?:</tt> and <tt>)$</tt>.</li>
        </ul>
        <t>The ECMAScript regexp is to be interpreted as a Unicode pattern ("u"
flag; see Section 21.2.2 "Pattern Semantics" of <xref target="ECMA-262"/>).</t>
        <t>Note that where a regexp literal is required,
the actual regexp needs to be enclosed in <tt>/</tt>.</t>
      </section>
      <section anchor="pcre-re2-ruby-regexps">
        <name>PCRE, RE2, Ruby Regexps</name>
        <t>Perform the same steps as in <xref target="toESreg"/> to obtain a valid regexp in
PCRE <xref target="PCRE2"/>, the Go programming language <xref target="RE2"/>, and the Ruby
programming language, except that the last step is:</t>
        <ul spacing="normal">
          <li>Enclose the regexp in <tt>\A(?:</tt> and <tt>)\z</tt>.</li>
        </ul>
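        <t>The steps in this section and in <xref target="toESreg"/> can be
sketched together in Python as follows. The function name
<tt>map_iregexp</tt> and the dialect labels <tt>"ecmascript"</tt> and
<tt>"pcre"</tt> (the latter standing for PCRE, RE2, and Ruby) are
hypothetical. The sketch assumes that its input has already been checked to
be a valid I-Regexp, and it performs only the listed steps.</t>
        <sourcecode type="python"><![CDATA[
```python
import re

# Tokenizer: a complete bracket expression, a backslash escape,
# or any other single character.
_TOKEN = re.compile(r"\[(?:\\.|[^\]\\])*\]|\\.|.", re.DOTALL)

# Enclosing prefix and suffix per target dialect, as listed in the text.
_ANCHORS = {
    "ecmascript": ("^(?:", ")$"),
    "pcre": (r"\A(?:", r")\z"),
}


def map_iregexp(iregexp, dialect):
    """Map a (presumed valid) I-Regexp to the given dialect's syntax."""
    prefix, suffix = _ANCHORS[dialect]
    # Step 1: replace each dot that is its own token, i.e. neither
    # escaped nor inside a character class.
    body = "".join(r"[^\n\r]" if tok == "." else tok
                   for tok in _TOKEN.findall(iregexp))
    # Step 2: enclose the result in the dialect's prefix and suffix.
    return prefix + body + suffix
```
]]></sourcecode>
        <t>For example, <tt>map_iregexp("a.b", "ecmascript")</tt> yields
<tt>^(?:a[^\n\r]b)$</tt>, while a dot inside a character class or after a
backslash is left unchanged.</t>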
      </section>
    </section>
    <section anchor="background">
      <name>Motivation and Background</name>
      <t>While regular expressions originally were intended to describe a
formal language to support a Boolean matching function, they
have been enhanced with parsing functions that support the extraction
and replacement of arbitrary portions of the matched text. With this
accretion of features, parsing regexp libraries have become
more susceptible to bugs and surprising performance degradations which
can be exploited in Denial of Service attacks by
an attacker who controls the regexp submitted for
processing. I-Regexp is designed to offer interoperability, and to be
less vulnerable to such attacks, with the trade-off that its only
function is to offer a boolean response as to whether a character
sequence is matched by a regexp.</t>
      <section anchor="subsetting">
        <name>Implementing I-Regexp</name>
        <t>XSD regexps are relatively easy to implement or map to widely
implemented parsing regexp dialects, with these notable
exceptions:</t>
        <ul spacing="normal">
          <li>Character class subtraction.  This is a very useful feature in many
specifications, but it is unfortunately mostly absent from parsing
regexp dialects. Thus, it is omitted from I-Regexp.</li>
          <li>Multi-character escapes.  <tt>\d</tt>, <tt>\w</tt>, <tt>\s</tt> and their uppercase
complement classes exhibit a
large amount of variation between regexp flavors.  Thus, they are
omitted from I-Regexp.</li>
          <li>Not all regexp implementations
support access to the Unicode tables that enable
executing constructs such as <tt>\p{Nd}</tt>,
although the <tt>\p</tt>/<tt>\P</tt> feature in general is now quite
widely available. While in principle it's possible to
translate these into character-class matches, this also requires
access to those tables. Thus, regexp libraries in severely
constrained environments may not be able to support I-Regexp
conformance.</li>
        </ul>
      </section>
    </section>
    <section anchor="iana-considerations">
      <name>IANA Considerations</name>
      <t>This document makes no requests of IANA.</t>
    </section>
    <section anchor="security-considerations">
      <name>Security Considerations</name>
      <t>As discussed in <xref target="background"/>, more complex regexp libraries may
contain exploitable bugs leading to crashes and remote code
execution.  There is also the problem that such libraries often have
hard-to-predict performance characteristics, leading to attacks
that overload an implementation by matching against an expensive
attacker-controlled regexp.</t>
      <t>I-Regexps have been designed to allow implementation in a way that is
resilient to both threats; this objective needs to be addressed
throughout the implementation effort.
Non-checking implementations (see <xref target="checking"/>) are likely to expose
security limitations of any regexp engine they use, which may be less
problematic if that engine has been built with security considerations
in mind (e.g., <xref target="RE2"/>); a checking implementation is still <bcp14>RECOMMENDED</bcp14>.</t>
    </section>
  </middle>
  <back>
    <references>
      <name>References</name>
      <references>
        <name>Normative References</name>
        <reference anchor="XSD-2" target="https://www.w3.org/TR/2004/REC-xmlschema-2-20041028/">
          <front>
            <title>XML Schema Part 2: Datatypes Second Edition</title>
            <author fullname="Ashok Malhotra" role="editor"/>
            <author fullname="Paul V. Biron" role="editor"/>
            <date day="28" month="October" year="2004"/>
          </front>
          <seriesInfo name="W3C REC" value="REC-xmlschema-2-20041028"/>
          <seriesInfo name="W3C" value="REC-xmlschema-2-20041028"/>
        </reference>
        <reference anchor="XSD11-2" target="https://www.w3.org/TR/2012/REC-xmlschema11-2-20120405/">
          <front>
            <title>W3C XML Schema Definition Language (XSD) 1.1 Part 2: Datatypes</title>
            <author fullname="Ashok Malhotra" role="editor"/>
            <author fullname="David Peterson" role="editor"/>
            <author fullname="Henry Thompson" role="editor"/>
            <author fullname="Michael Sperberg-McQueen" role="editor"/>
            <author fullname="Paul V. Biron" role="editor"/>
            <author fullname="Sandy Gao" role="editor"/>
            <date day="5" month="April" year="2012"/>
          </front>
          <seriesInfo name="W3C REC" value="REC-xmlschema11-2-20120405"/>
          <seriesInfo name="W3C" value="REC-xmlschema11-2-20120405"/>
        </reference>
        <reference anchor="RFC5234">
          <front>
            <title>Augmented BNF for Syntax Specifications: ABNF</title>
            <author fullname="D. Crocker" initials="D." role="editor" surname="Crocker">
              <organization/>
            </author>
            <author fullname="P. Overell" initials="P." surname="Overell">
              <organization/>
            </author>
            <date month="January" year="2008"/>
            <abstract>
              <t>Internet technical specifications often need to define a formal syntax.  Over the years, a modified version of Backus-Naur Form (BNF), called Augmented BNF (ABNF), has been popular among many Internet specifications.  The current specification documents ABNF. It balances compactness and simplicity with reasonable representational power.  The differences between standard BNF and ABNF involve naming rules, repetition, alternatives, order-independence, and value ranges.  This specification also supplies additional rule definitions and encoding for a core lexical analyzer of the type common to several Internet specifications.  [STANDARDS-TRACK]</t>
            </abstract>
          </front>
          <seriesInfo name="STD" value="68"/>
          <seriesInfo name="RFC" value="5234"/>
          <seriesInfo name="DOI" value="10.17487/RFC5234"/>
        </reference>
        <reference anchor="RFC7405">
          <front>
            <title>Case-Sensitive String Support in ABNF</title>
            <author fullname="P. Kyzivat" initials="P." surname="Kyzivat">
              <organization/>
            </author>
            <date month="December" year="2014"/>
            <abstract>
              <t>This document extends the base definition of ABNF (Augmented Backus-Naur Form) to include a way to specify US-ASCII string literals that are matched in a case-sensitive manner.</t>
            </abstract>
          </front>
          <seriesInfo name="RFC" value="7405"/>
          <seriesInfo name="DOI" value="10.17487/RFC7405"/>
        </reference>
        <reference anchor="RFC2119">
          <front>
            <title>Key words for use in RFCs to Indicate Requirement Levels</title>
            <author fullname="S. Bradner" initials="S." surname="Bradner">
              <organization/>
            </author>
            <date month="March" year="1997"/>
            <abstract>
              <t>In many standards track documents several words are used to signify the requirements in the specification.  These words are often capitalized. This document defines these words as they should be interpreted in IETF documents.  This document specifies an Internet Best Current Practices for the Internet Community, and requests discussion and suggestions for improvements.</t>
            </abstract>
          </front>
          <seriesInfo name="BCP" value="14"/>
          <seriesInfo name="RFC" value="2119"/>
          <seriesInfo name="DOI" value="10.17487/RFC2119"/>
        </reference>
        <reference anchor="RFC8174">
          <front>
            <title>Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words</title>
            <author fullname="B. Leiba" initials="B." surname="Leiba">
              <organization/>
            </author>
            <date month="May" year="2017"/>
            <abstract>
              <t>RFC 2119 specifies common key words that may be used in protocol  specifications.  This document aims to reduce the ambiguity by clarifying that only UPPERCASE usage of the key words have the  defined special meanings.</t>
            </abstract>
          </front>
          <seriesInfo name="BCP" value="14"/>
          <seriesInfo name="RFC" value="8174"/>
          <seriesInfo name="DOI" value="10.17487/RFC8174"/>
        </reference>
      </references>
      <references>
        <name>Informative References</name>
        <reference anchor="RE2" target="https://github.com/google/re2">
          <front>
            <title>RE2 is a fast, safe, thread-friendly alternative to backtracking regular expression engines like those used in PCRE, Perl, and Python. It is a C++ library.</title>
            <author>
              <organization/>
            </author>
            <date>n.d.</date>
          </front>
        </reference>
        <reference anchor="PCRE2" target="http://pcre.org/current/doc/html/">
          <front>
            <title>Perl-compatible Regular Expressions (revised API: PCRE2)</title>
            <author>
              <organization/>
            </author>
            <date>n.d.</date>
          </front>
        </reference>
        <reference anchor="ECMA-262" target="https://www.ecma-international.org/wp-content/uploads/ECMA-262.pdf">
          <front>
            <title>ECMAScript 2020 Language Specification</title>
            <author>
              <organization>Ecma International</organization>
            </author>
            <date year="2020" month="June"/>
          </front>
          <seriesInfo name="ECMA" value="Standard ECMA-262, 11th Edition"/>
        </reference>
        <reference anchor="RFC7493">
          <front>
            <title>The I-JSON Message Format</title>
            <author fullname="T. Bray" initials="T." role="editor" surname="Bray">
              <organization/>
            </author>
            <date month="March" year="2015"/>
            <abstract>
              <t>I-JSON (short for "Internet JSON") is a restricted profile of JSON designed to maximize interoperability and increase confidence that software can process it successfully with predictable results.</t>
            </abstract>
          </front>
          <seriesInfo name="RFC" value="7493"/>
          <seriesInfo name="DOI" value="10.17487/RFC7493"/>
        </reference>
      </references>
    </references>
    <section anchor="rfcs" removeInRFC="true">
      <name>Regexps and Similar Constructs in Recent Published RFCs</name>
      <t>This appendix contains a number of regular expressions that have been
extracted from some recently published RFCs based on some ad-hoc matching.
Multi-line constructions were not included.
With the exception of some (often surprisingly dubious) usage of multi-character
escapes and a reference to the <tt>IsBasicLatin</tt> Unicode block, all
regular expressions validate against the ABNF in <xref target="iregexp-abnf"/>.</t>
      <figure anchor="iregexp-examples">
        <name>Example regular expressions extracted from RFCs</name>
        <artwork><![CDATA[
rfc6021.txt  459 (([0-1](\.[1-3]?[0-9]))|(2\.(0|([1-9]\d*))))
rfc6021.txt  513 \d*(\.\d*){1,127}
rfc6021.txt  529 \d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?
rfc6021.txt  631 ([0-9a-fA-F]{2}(:[0-9a-fA-F]{2})*)?
rfc6021.txt  647 [0-9a-fA-F]{2}(:[0-9a-fA-F]{2}){5}
rfc6021.txt  933 ((:|[0-9a-fA-F]{0,4}):)([0-9a-fA-F]{0,4}:){0,5}
rfc6021.txt  938 (([^:]+:){6}(([^:]+:[^:]+)|(.*\..*)))|
rfc6021.txt 1026 ((:|[0-9a-fA-F]{0,4}):)([0-9a-fA-F]{0,4}:){0,5}
rfc6021.txt 1031 (([^:]+:){6}(([^:]+:[^:]+)|(.*\..*)))|
rfc6020.txt 6647 [0-9a-fA-F]*
rfc6095.txt 2544 \S(.*\S)?
rfc6110.txt 1583 [aeiouy]*
rfc6110.txt 3222 [A-Z][a-z]*
rfc6536.txt 1583 \*
rfc6536.txt 1632 [^\*].*
rfc6643.txt  524 \p{IsBasicLatin}{0,255}
rfc6728.txt 3480 \S+
rfc6728.txt 3500 \S(.*\S)?
rfc6991.txt  477 (([0-1](\.[1-3]?[0-9]))|(2\.(0|([1-9]\d*))))
rfc6991.txt  525 \d*(\.\d*){1,127}
rfc6991.txt  541 [a-zA-Z_][a-zA-Z0-9\-_.]*
rfc6991.txt  542 .|..|[^xX].*|.[^mM].*|..[^lL].*
rfc6991.txt  571 \d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?
rfc6991.txt  665 ([0-9a-fA-F]{2}(:[0-9a-fA-F]{2})*)?
rfc6991.txt  693 [0-9a-fA-F]{2}(:[0-9a-fA-F]{2}){5}
rfc6991.txt  725 ([0-9a-fA-F]{2}(:[0-9a-fA-F]{2})*)?
rfc6991.txt  743 [0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-
rfc6991.txt 1041 ((:|[0-9a-fA-F]{0,4}):)([0-9a-fA-F]{0,4}:){0,5}
rfc6991.txt 1046 (([^:]+:){6}(([^:]+:[^:]+)|(.*\..*)))|
rfc6991.txt 1099 [0-9\.]*
rfc6991.txt 1109 [0-9a-fA-F:\.]*
rfc6991.txt 1164 ((:|[0-9a-fA-F]{0,4}):)([0-9a-fA-F]{0,4}:){0,5}
rfc6991.txt 1169 (([^:]+:){6}(([^:]+:[^:]+)|(.*\..*)))|
rfc7407.txt  933 ([0-9a-fA-F]){2}(:([0-9a-fA-F]){2}){0,254}
rfc7407.txt 1494 ([0-9a-fA-F]){2}(:([0-9a-fA-F]){2}){4,31}
rfc7758.txt  703 \d{2}:\d{2}:\d{2}(\.\d+)?
rfc7758.txt 1358 \d{2}:\d{2}:\d{2}(\.\d+)?
rfc7895.txt  349 \d{4}-\d{2}-\d{2}
rfc7950.txt 8323 [0-9a-fA-F]*
rfc7950.txt 8355 [a-zA-Z_][a-zA-Z0-9\-_.]*
rfc7950.txt 8356 [xX][mM][lL].*
rfc8040.txt 4713 \d{4}-\d{2}-\d{2}
rfc8049.txt 6704 [A-Z]{2}
rfc8194.txt  629 \*
rfc8194.txt  637 [0-9]{8}\.[0-9]{6}
rfc8194.txt  905 Z|[\+\-]\d{2}:\d{2}
rfc8194.txt  963 (2((2[4-9])|(3[0-9]))\.).*
rfc8194.txt  974 (([fF]{2}[0-9a-fA-F]{2}):).*
rfc8299.txt 7986 [A-Z]{2}
rfc8341.txt 1878 \*
rfc8341.txt 1927 [^\*].*
rfc8407.txt 1723 [0-9\.]*
rfc8407.txt 1749 [a-zA-Z_][a-zA-Z0-9\-_.]*
rfc8407.txt 1750 .|..|[^xX].*|.[^mM].*|..[^lL].*
rfc8525.txt  550 \d{4}-\d{2}-\d{2}
rfc8776.txt  838 /?([a-zA-Z0-9\-_.]+)(/[a-zA-Z0-9\-_.]+)*
rfc8776.txt  874 ([a-zA-Z0-9\-_.]+:)*
rfc8819.txt  311 [\S ]+
rfc8944.txt  596 [0-9a-fA-F]{2}(:[0-9a-fA-F]{2}){7}
]]></artwork>
      </figure>
    </section>
    <section numbered="false" anchor="acknowledgements">
      <name>Acknowledgements</name>
      <t>This draft has been motivated by the discussion in the IETF JSONPath
WG about whether to include a regexp mechanism into the JSONPath query
expression specification, as well as by previous discussions about the
YANG <tt>pattern</tt> and CDDL <tt>.regexp</tt> features.</t>
      <t>The basic approach for this draft was inspired by <xref target="RFC7493">The
I-JSON Message Format</xref>.</t>
    </section>
  </back>

</rfc>
