From ce63921d31b4b21d11867e75941fdd6fe06e830d Mon Sep 17 00:00:00 2001 From: Andres Noetzli Date: Fri, 29 Apr 2022 11:45:21 -0700 Subject: [PATCH] Document non-standard string operators (#8679) This adds documentation for `str.indexof_re`, `str.update`, `str.rev`, `str.to_lower`, and `str.to_upper`. Note that it moves the documentation of strings to the "non-standard or extended theories". --- docs/theories/strings.rst | 68 ++++++++++++++++++++++++++++++++++++++ docs/theories/theories.rst | 1 + 2 files changed, 69 insertions(+) create mode 100644 docs/theories/strings.rst diff --git a/docs/theories/strings.rst b/docs/theories/strings.rst new file mode 100644 index 000000000..8a5cad220 --- /dev/null +++ b/docs/theories/strings.rst @@ -0,0 +1,68 @@ +Theory Reference: Strings +========================= + +cvc5 supports all operators of the `SMT-LIB standard for strings +`_. It additionally +supports some non-standard operators that are described below. + +Semantics +^^^^^^^^^ + +.. code-block:: + + * (str.indexof_re String RegLan Int Int) + + Let w₂ = ⟦str.substr⟧(w, i, |w| - i) + + - ⟦str.indexof_re⟧(w, L, i) = -1 if no substring of w₂ is in L or i < 0 + + - ⟦str.indexof_re⟧(w, L, i) = |u₁| + where u₁, w₁ are the shortest words such that + - w₂ = u₁w₁u₂ + - w₁ ∈ L + if some substring of w₂ is in L and i > 0 + + * (str.update String Int String) + + - ⟦str.update⟧(w, i, w₂) = w if i < 0 or i >= |w| + + - ⟦str.update⟧(w, i, w₂) = u₁u₂u₃ + where + - w = u₁w₃u₃ + - |w₃| = |u₂| + - |u₁| = i + - u₂u₄ = w₂ + - |u₂| = min(|w₂|, |w| - i) otherwise + + * (str.rev String String) + + ⟦str.rev⟧(w) is the string obtained by reversing w, e.g., + ⟦str.rev⟧("abc") = "cba". + + * (str.to_lower String String) + + ⟦str.to_lower⟧(w) = w₂ + where + - |w| = |w₂| + - the i-th character ri in w₂ is: + + code(ri) = code(si) + ite(65 <= code(si) <= 90, 32, 0) + + where si is the i-th character in w + + Note: This operator performs the case conversion for the ASCII portion of + Unicode only. + + * (str.to_upper String String) + + ⟦str.to_upper⟧(w) = w₂ + where + - |w| = |w₂| + - the i-th character ri in w₂ is: + + code(ri) = code(si) - ite(97 <= code(si) <= 122, 32, 0) + + where si is the i-th character in w + + Note: This operator performs the case conversion for the ASCII portion of + Unicode only. diff --git a/docs/theories/theories.rst b/docs/theories/theories.rst index fd00cc658..1eb165b9e 100644 --- a/docs/theories/theories.rst +++ b/docs/theories/theories.rst @@ -34,4 +34,5 @@ Non-standard or extended theories separation-logic sequences sets-and-relations + strings transcendentals -- 2.30.2