Wtf8Buf

Struct Wtf8Buf 

Source
#[doc(hidden)] pub struct Wtf8Buf {
    bytes: Vec<u8>,
    is_known_utf8: bool,
}
🔬This is a nightly-only experimental API. (wtf8_internals)

An owned, growable string of well-formed WTF-8 data.

Similar to String, but can additionally contain surrogate code points if they’re not in a surrogate pair.
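As a rough illustration of what "can contain surrogate code points" means at the byte level, the sketch below encodes a surrogate with the same three-byte pattern that UTF-8 uses for other code points in U+0800..=U+FFFF. The helper name `encode_surrogate` is hypothetical and not part of this API; only stable standard library calls are used.

```rust
/// Hypothetical helper: encodes a surrogate code point (U+D800..=U+DFFF)
/// as the three-byte generalized UTF-8 sequence that WTF-8 uses for it.
fn encode_surrogate(surrogate: u16) -> [u8; 3] {
    assert!((0xD800..=0xDFFF).contains(&surrogate), "not a surrogate");
    [
        0xE0 | (surrogate >> 12) as u8,          // lead byte, always 0xED here
        0x80 | ((surrogate >> 6) & 0x3F) as u8,  // middle six bits
        0x80 | (surrogate & 0x3F) as u8,         // low six bits
    ]
}

/// Strict UTF-8 rejects these sequences, which is why a `String`
/// cannot hold them.
fn is_valid_utf8(bytes: &[u8]) -> bool {
    std::str::from_utf8(bytes).is_ok()
}
```

Calling `encode_surrogate(0xD800)` yields the bytes `ED A0 80`, which `is_valid_utf8` rejects.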

Fields§

§bytes: Vec<u8>
🔬This is a nightly-only experimental API. (wtf8_internals)
§is_known_utf8: bool
🔬This is a nightly-only experimental API. (wtf8_internals)

Do we know that bytes holds a valid UTF-8 encoding? We can easily know this if we’re constructed from a String or &str.

It is possible for bytes to have valid UTF-8 without this being set, such as when we’re concatenating &Wtf8’s and surrogates become paired, as we don’t bother to rescan the entire string.

Implementations§

Source§

impl Wtf8Buf

Source

pub fn new() -> Wtf8Buf

🔬This is a nightly-only experimental API. (wtf8_internals)

Creates a new, empty WTF-8 string.

Source

pub fn with_capacity(capacity: usize) -> Wtf8Buf

🔬This is a nightly-only experimental API. (wtf8_internals)

Creates a new, empty WTF-8 string with pre-allocated capacity for capacity bytes.

Source

pub unsafe fn from_bytes_unchecked(value: Vec<u8>) -> Wtf8Buf

🔬This is a nightly-only experimental API. (wtf8_internals)

Creates a WTF-8 string from a WTF-8 byte vec.

Since the byte vec is not checked for valid WTF-8, this function is marked unsafe.

Source

pub const fn from_string(string: String) -> Wtf8Buf

🔬This is a nightly-only experimental API. (wtf8_internals)

Creates a WTF-8 string from a UTF-8 String.

This takes ownership of the String and does not copy.

Since WTF-8 is a superset of UTF-8, this always succeeds.

Source

pub fn from_str(s: &str) -> Wtf8Buf

🔬This is a nightly-only experimental API. (wtf8_internals)

Creates a WTF-8 string from a UTF-8 &str slice.

This copies the content of the slice.

Since WTF-8 is a superset of UTF-8, this always succeeds.

Source

pub fn clear(&mut self)

🔬This is a nightly-only experimental API. (wtf8_internals)
Source

pub fn from_wide(v: &[u16]) -> Wtf8Buf

🔬This is a nightly-only experimental API. (wtf8_internals)

Creates a WTF-8 string from a potentially ill-formed UTF-16 slice of 16-bit code units.

This is lossless: calling .encode_wide() on the resulting string will always return the original code units.
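The lossless round trip described above can be sketched on plain byte and `u16` vectors with stable APIs. This is not the implementation behind `from_wide`; the function names `wtf8_from_wide`, `wtf8_encode_wide`, and `encode_code_point` are hypothetical, and the decoder assumes its input is well-formed WTF-8.

```rust
/// Encodes one code point (scalar value or lone surrogate) as generalized UTF-8.
fn encode_code_point(cp: u32, out: &mut Vec<u8>) {
    match cp {
        0..=0x7F => out.push(cp as u8),
        0x80..=0x7FF => {
            out.push(0xC0 | (cp >> 6) as u8);
            out.push(0x80 | (cp & 0x3F) as u8);
        }
        0x800..=0xFFFF => {
            out.push(0xE0 | (cp >> 12) as u8);
            out.push(0x80 | ((cp >> 6) & 0x3F) as u8);
            out.push(0x80 | (cp & 0x3F) as u8);
        }
        _ => {
            out.push(0xF0 | (cp >> 18) as u8);
            out.push(0x80 | ((cp >> 12) & 0x3F) as u8);
            out.push(0x80 | ((cp >> 6) & 0x3F) as u8);
            out.push(0x80 | (cp & 0x3F) as u8);
        }
    }
}

/// Converts potentially ill-formed UTF-16 to WTF-8 bytes.
/// Valid pairs become supplementary code points; lone surrogates are kept.
fn wtf8_from_wide(units: &[u16]) -> Vec<u8> {
    let mut out = Vec::with_capacity(units.len());
    for item in char::decode_utf16(units.iter().copied()) {
        match item {
            Ok(c) => encode_code_point(c as u32, &mut out),
            Err(e) => encode_code_point(e.unpaired_surrogate() as u32, &mut out),
        }
    }
    out
}

/// Converts well-formed WTF-8 bytes back to 16-bit code units.
/// Assumes valid input; malformed bytes may cause an out-of-bounds panic.
fn wtf8_encode_wide(bytes: &[u8]) -> Vec<u16> {
    let cont = |b: u8| (b as u32) & 0x3F;
    let mut out = Vec::new();
    let mut i = 0;
    while i < bytes.len() {
        let b = bytes[i] as u32;
        let (cp, len) = if b < 0x80 {
            (b, 1)
        } else if b < 0xE0 {
            (((b & 0x1F) << 6) | cont(bytes[i + 1]), 2)
        } else if b < 0xF0 {
            (((b & 0x0F) << 12) | (cont(bytes[i + 1]) << 6) | cont(bytes[i + 2]), 3)
        } else {
            let hi = ((b & 0x07) << 18) | (cont(bytes[i + 1]) << 12);
            (hi | (cont(bytes[i + 2]) << 6) | cont(bytes[i + 3]), 4)
        };
        if cp >= 0x1_0000 {
            // Supplementary code points split into a surrogate pair.
            let v = cp - 0x1_0000;
            out.push(0xD800 | (v >> 10) as u16);
            out.push(0xDC00 | (v & 0x3FF) as u16);
        } else {
            out.push(cp as u16);
        }
        i += len;
    }
    out
}
```

Under these assumptions, feeding the output of `wtf8_from_wide` to `wtf8_encode_wide` returns the original code units, including any unpaired surrogates.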

Source

unsafe fn push_code_point_unchecked(&mut self, code_point: CodePoint)

🔬This is a nightly-only experimental API. (wtf8_internals)

Appends the given code point to the end of this string. This does not include the WTF-8 concatenation check or the is_known_utf8 check. Copied from String::push.

Source

pub fn as_slice(&self) -> &Wtf8

🔬This is a nightly-only experimental API. (wtf8_internals)
Source

pub fn as_mut_slice(&mut self) -> &mut Wtf8

🔬This is a nightly-only experimental API. (wtf8_internals)
Source

fn as_known_utf8(&self) -> Option<&str>

🔬This is a nightly-only experimental API. (wtf8_internals)

Converts the string to UTF-8 without validation, if it was created from valid UTF-8.

Source

pub fn reserve(&mut self, additional: usize)

🔬This is a nightly-only experimental API. (wtf8_internals)

Reserves capacity for at least additional more bytes to be inserted in the given Wtf8Buf. The collection may reserve more space to avoid frequent reallocations.

§Panics

Panics if the new capacity exceeds isize::MAX bytes.

Source

pub fn try_reserve(&mut self, additional: usize) -> Result<(), TryReserveError>

🔬This is a nightly-only experimental API. (wtf8_internals)

Tries to reserve capacity for at least additional more bytes to be inserted in the given Wtf8Buf. The Wtf8Buf may reserve more space to avoid frequent reallocations. After calling try_reserve, capacity will be greater than or equal to self.len() + additional. Does nothing if capacity is already sufficient. This method preserves the contents even if an error occurs.

§Errors

If the capacity overflows, or the allocator reports a failure, then an error is returned.
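Because this type is internal, the fallible-reservation pattern is easiest to try on a plain `Vec<u8>`, whose stable `try_reserve` has the same signature. The sketch below is an analogy under that assumption; `append_checked` is a hypothetical helper name.

```rust
use std::collections::TryReserveError;

/// Appends `data` to `buf`, reporting allocation failure as an error
/// instead of aborting. The contents of `buf` are untouched on error.
fn append_checked(buf: &mut Vec<u8>, data: &[u8]) -> Result<(), TryReserveError> {
    // Reserve first; this is the only step that can fail.
    buf.try_reserve(data.len())?;
    // Capacity is now sufficient, so this does not reallocate.
    buf.extend_from_slice(data);
    Ok(())
}
```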

Source

pub fn reserve_exact(&mut self, additional: usize)

🔬This is a nightly-only experimental API. (wtf8_internals)
Source

pub fn try_reserve_exact( &mut self, additional: usize, ) -> Result<(), TryReserveError>

🔬This is a nightly-only experimental API. (wtf8_internals)

Tries to reserve the minimum capacity for exactly additional more bytes to be inserted in the given Wtf8Buf. After calling try_reserve_exact, capacity will be greater than or equal to self.len() + additional if it returns Ok(()). Does nothing if the capacity is already sufficient.

Note that the allocator may give the Wtf8Buf more space than it requests. Therefore, capacity cannot be relied upon to be precisely minimal. Prefer try_reserve if future insertions are expected.

§Errors

If the capacity overflows, or the allocator reports a failure, then an error is returned.

Source

pub fn shrink_to_fit(&mut self)

🔬This is a nightly-only experimental API. (wtf8_internals)
Source

pub fn shrink_to(&mut self, min_capacity: usize)

🔬This is a nightly-only experimental API. (wtf8_internals)
Source

pub fn leak<'a>(self) -> &'a mut Wtf8

🔬This is a nightly-only experimental API. (wtf8_internals)
Source

pub fn capacity(&self) -> usize

🔬This is a nightly-only experimental API. (wtf8_internals)

Returns the number of bytes that this string buffer can hold without reallocating.

Source

pub fn push_str(&mut self, other: &str)

🔬This is a nightly-only experimental API. (wtf8_internals)

Appends a UTF-8 slice at the end of the string.

Source

pub fn push_wtf8(&mut self, other: &Wtf8)

🔬This is a nightly-only experimental API. (wtf8_internals)

Appends a WTF-8 slice at the end of the string.

This replaces newly paired surrogates at the boundary with a supplementary code point, like concatenating ill-formed UTF-16 strings effectively would.
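The boundary fix-up can be sketched on raw byte vectors. The helper names below mirror `final_lead_surrogate` and `initial_trail_surrogate` listed on this page, but the bodies are an assumption written for illustration, not the library's code; both inputs are assumed to be well-formed WTF-8.

```rust
/// Decodes the surrogate stored in `ED second third`.
fn decode_surrogate(second: u8, third: u8) -> u16 {
    0xD000 | (((second as u16) & 0x3F) << 6) | ((third as u16) & 0x3F)
}

/// Lead surrogate (U+D800..=U+DBFF) encoded in the last three bytes, if any.
fn final_lead_surrogate(bytes: &[u8]) -> Option<u16> {
    match bytes {
        [.., 0xED, b2 @ 0xA0..=0xAF, b3] => Some(decode_surrogate(*b2, *b3)),
        _ => None,
    }
}

/// Trail surrogate (U+DC00..=U+DFFF) encoded in the first three bytes, if any.
fn initial_trail_surrogate(bytes: &[u8]) -> Option<u16> {
    match bytes {
        [0xED, b2 @ 0xB0..=0xBF, b3, ..] => Some(decode_surrogate(*b2, *b3)),
        _ => None,
    }
}

/// Appends `other` to `buf`, merging a lead/trail pair that meets at the seam.
fn push_wtf8(buf: &mut Vec<u8>, other: &[u8]) {
    match (final_lead_surrogate(buf.as_slice()), initial_trail_surrogate(other)) {
        (Some(lead), Some(trail)) => {
            // Drop the three-byte lead surrogate and write the combined
            // supplementary code point as ordinary four-byte UTF-8.
            buf.truncate(buf.len() - 3);
            let cp = 0x1_0000 + (((lead as u32) - 0xD800) << 10) + ((trail as u32) - 0xDC00);
            let c = char::from_u32(cp).expect("a surrogate pair maps to a scalar value");
            let mut tmp = [0u8; 4];
            buf.extend_from_slice(c.encode_utf8(&mut tmp).as_bytes());
            // Skip the three-byte trail surrogate at the start of `other`.
            buf.extend_from_slice(&other[3..]);
        }
        _ => buf.extend_from_slice(other),
    }
}
```

Without the merge step, the result would hold two adjacent three-byte surrogates, which is not well-formed WTF-8.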

Source

pub fn push_char(&mut self, c: char)

🔬This is a nightly-only experimental API. (wtf8_internals)

Appends a Unicode scalar value at the end of the string.

Source

pub fn push(&mut self, code_point: CodePoint)

🔬This is a nightly-only experimental API. (wtf8_internals)

Appends a code point at the end of the string.

This replaces newly paired surrogates at the boundary with a supplementary code point, like concatenating ill-formed UTF-16 strings effectively would.

Source

pub fn truncate(&mut self, new_len: usize)

🔬This is a nightly-only experimental API. (wtf8_internals)

Shortens a string to the specified length.

§Panics

Panics if new_len > current length, or if new_len is not a code point boundary.

Source

pub fn into_bytes(self) -> Vec<u8>

🔬This is a nightly-only experimental API. (wtf8_internals)

Consumes the WTF-8 string and converts it to a vec of bytes.

Source

pub fn into_string(self) -> Result<String, Wtf8Buf>

🔬This is a nightly-only experimental API. (wtf8_internals)

Consumes the WTF-8 string and tries to convert it to UTF-8.

This does not copy the data.

If the contents are not well-formed UTF-8 (that is, if the string contains surrogates), the original WTF-8 string is returned instead.
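The "give the buffer back on failure" shape of this method has a close analogue in stable Rust: `String::from_utf8` rejects surrogate encodings, and its error returns the original vector without copying. The sketch below models the WTF-8 buffer as a bare `Vec<u8>`, which is an assumption made only for illustration; `into_string_sketch` is a hypothetical name.

```rust
/// Tries to turn WTF-8 bytes into a `String`; on failure the untouched
/// input vector is handed back, mirroring `Result<String, Wtf8Buf>`.
fn into_string_sketch(bytes: Vec<u8>) -> Result<String, Vec<u8>> {
    // Strict UTF-8 validation fails exactly when well-formed WTF-8
    // input contains a surrogate.
    String::from_utf8(bytes).map_err(|e| e.into_bytes())
}
```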

Source

pub fn into_string_lossy(self) -> String

🔬This is a nightly-only experimental API. (wtf8_internals)

Consumes the WTF-8 string and converts it lossily to UTF-8.

This does not copy the data (but may overwrite parts of it in place).

Surrogates are replaced with "\u{FFFD}" (the replacement character “�”).
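In-place replacement is possible because a surrogate and U+FFFD both occupy three bytes in this encoding. A minimal sketch on a raw byte vector, assuming well-formed WTF-8 input (`into_string_lossy_sketch` is a hypothetical name, not the library's code):

```rust
/// Overwrites every encoded surrogate (ED A0..=BF xx) with U+FFFD
/// (EF BF BD) and returns the result as a `String`.
fn into_string_lossy_sketch(mut bytes: Vec<u8>) -> String {
    let replacement = "\u{FFFD}".as_bytes(); // exactly three bytes
    let mut i = 0;
    while i + 2 < bytes.len() {
        // 0xED followed by a byte >= 0xA0 marks a surrogate; 0xED followed
        // by 0x80..=0x9F is an ordinary code point in U+D000..=U+D7FF.
        if bytes[i] == 0xED && bytes[i + 1] >= 0xA0 {
            bytes[i..i + 3].copy_from_slice(replacement);
            i += 3;
        } else {
            i += 1;
        }
    }
    String::from_utf8(bytes).expect("input assumed to be well-formed WTF-8")
}
```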

Source

pub fn into_box(self) -> Box<Wtf8>

🔬This is a nightly-only experimental API. (wtf8_internals)

Converts this Wtf8Buf into a boxed Wtf8.

Source

pub fn from_box(boxed: Box<Wtf8>) -> Wtf8Buf

🔬This is a nightly-only experimental API. (wtf8_internals)

Converts a Box<Wtf8> into a Wtf8Buf.

Source

pub unsafe fn extend_from_slice_unchecked(&mut self, other: &[u8])

🔬This is a nightly-only experimental API. (wtf8_internals)

Provides plumbing to the underlying Vec::extend_from_slice. This is a better-behaved alternative to giving outer types full mutable access to the underlying Vec.

Methods from Deref<Target = Wtf8>§

Source

pub fn to_owned(&self) -> Wtf8Buf

🔬This is a nightly-only experimental API. (wtf8_internals)
Source

pub fn clone_into(&self, buf: &mut Wtf8Buf)

🔬This is a nightly-only experimental API. (wtf8_internals)
Source

pub fn to_string_lossy(&self) -> Cow<'_, str>

🔬This is a nightly-only experimental API. (wtf8_internals)
Source

pub fn into_box(&self) -> Box<Wtf8>

🔬This is a nightly-only experimental API. (wtf8_internals)
Source

pub fn into_arc(&self) -> Arc<Wtf8>

🔬This is a nightly-only experimental API. (wtf8_internals)
Source

pub fn into_rc(&self) -> Rc<Wtf8>

🔬This is a nightly-only experimental API. (wtf8_internals)
Source

pub fn to_ascii_lowercase(&self) -> Wtf8Buf

🔬This is a nightly-only experimental API. (wtf8_internals)
Source

pub fn to_ascii_uppercase(&self) -> Wtf8Buf

🔬This is a nightly-only experimental API. (wtf8_internals)
Source

pub fn len(&self) -> usize

🔬This is a nightly-only experimental API. (wtf8_internals)

Returns the length, in WTF-8 bytes.

Source

pub fn is_empty(&self) -> bool

🔬This is a nightly-only experimental API. (wtf8_internals)
Source

pub fn ascii_byte_at(&self, position: usize) -> u8

🔬This is a nightly-only experimental API. (wtf8_internals)

Returns the code point at position if it is in the ASCII range, or b'\xFF' otherwise.

§Panics

Panics if position is beyond the end of the string.
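The sentinel works because 0xFF never appears in WTF-8 data, so it cannot be confused with a real ASCII byte. A small sketch over a byte slice (`ascii_byte_at_sketch` is a hypothetical free function, not this method):

```rust
/// Returns the byte at `position` if it is ASCII, or 0xFF otherwise.
/// Panics if `position` is out of bounds, like slice indexing.
fn ascii_byte_at_sketch(bytes: &[u8], position: usize) -> u8 {
    match bytes[position] {
        ascii @ 0x00..=0x7F => ascii,
        _ => 0xFF,
    }
}
```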

Source

pub fn code_points(&self) -> Wtf8CodePoints<'_>

🔬This is a nightly-only experimental API. (wtf8_internals)

Returns an iterator for the string’s code points.

Source

pub fn as_bytes(&self) -> &[u8]

🔬This is a nightly-only experimental API. (wtf8_internals)

Accesses the raw bytes of the WTF-8 data.

Source

pub fn as_str(&self) -> Result<&str, Utf8Error>

🔬This is a nightly-only experimental API. (wtf8_internals)

Tries to convert the string to UTF-8 and return a &str slice.

Returns an Err if the string contains surrogates.

This does not copy the data.
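On raw bytes, the same check can be reproduced with the stable `std::str::from_utf8`, which borrows rather than copies and reports where validation stopped. The sketch assumes the WTF-8 data is available as a byte slice; `as_str_sketch` is a hypothetical name.

```rust
use std::str::Utf8Error;

/// Borrows the bytes as `&str` if they are strict UTF-8.
/// An encoded surrogate makes validation fail.
fn as_str_sketch(bytes: &[u8]) -> Result<&str, Utf8Error> {
    std::str::from_utf8(bytes)
}
```

The returned `Utf8Error` exposes `valid_up_to()`, the byte offset of the first rejected sequence.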

Source

pub fn encode_wide(&self) -> EncodeWide<'_>

🔬This is a nightly-only experimental API. (wtf8_internals)

Converts the WTF-8 string to potentially ill-formed UTF-16 and returns an iterator of 16-bit code units.

This is lossless: calling Wtf8Buf::from_wide on the resulting code units would always return the original WTF-8 string.

Source

pub fn next_surrogate(&self, pos: usize) -> Option<(usize, u16)>

🔬This is a nightly-only experimental API. (wtf8_internals)
Source

pub fn final_lead_surrogate(&self) -> Option<u16>

🔬This is a nightly-only experimental API. (wtf8_internals)
Source

pub fn initial_trail_surrogate(&self) -> Option<u16>

🔬This is a nightly-only experimental API. (wtf8_internals)
Source

pub fn make_ascii_lowercase(&mut self)

🔬This is a nightly-only experimental API. (wtf8_internals)
Source

pub fn make_ascii_uppercase(&mut self)

🔬This is a nightly-only experimental API. (wtf8_internals)
Source

pub fn is_ascii(&self) -> bool

🔬This is a nightly-only experimental API. (wtf8_internals)
Source

pub fn eq_ignore_ascii_case(&self, other: &Wtf8) -> bool

🔬This is a nightly-only experimental API. (wtf8_internals)
Source

pub fn is_code_point_boundary(&self, index: usize) -> bool

🔬This is a nightly-only experimental API. (wtf8_internals)

Copied from str::is_char_boundary.
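The boundary test only needs to look at one byte: continuation bytes have the bit pattern `10xxxxxx`, and every other byte starts a code point. A sketch over a byte slice, assuming well-formed WTF-8 (`is_code_point_boundary_sketch` is a hypothetical free function):

```rust
/// Returns true if `index` is the start of a code point or the end of the data.
fn is_code_point_boundary_sketch(bytes: &[u8], index: usize) -> bool {
    if index == 0 {
        return true;
    }
    match bytes.get(index) {
        // The end of the data is a boundary; anything past it is not.
        None => index == bytes.len(),
        // Continuation bytes are 0b10xx_xxxx; all other bytes begin a sequence.
        Some(&b) => (b & 0xC0) != 0x80,
    }
}
```

Note that this treats the position between two bytes of a surrogate's encoding as a non-boundary, just as for any other three-byte sequence.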

Source

pub fn check_utf8_boundary(&self, index: usize)

🔬This is a nightly-only experimental API. (wtf8_internals)

Verify that index is at the edge of either a valid UTF-8 codepoint (i.e. a codepoint that’s not a surrogate) or of the whole string.

These are the cases currently permitted by OsStr::self_encoded_bytes. Splitting between surrogates is valid as far as WTF-8 is concerned, but we do not permit it in the public API because WTF-8 is considered an implementation detail.

Trait Implementations§

Source§

impl Clone for Wtf8Buf

Source§

fn clone(&self) -> Wtf8Buf

Returns a duplicate of the value.
1.0.0 · Source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.
Source§

impl Debug for Wtf8Buf

Formats the string in double quotes, with characters escaped according to char::escape_debug and unpaired surrogates represented as \u{xxxx}, where each x is a hexadecimal digit.

For example, the code units [U+0061, U+D800, U+000A] are formatted as "a\u{D800}\n".

Source§

fn fmt(&self, formatter: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.
Source§

impl Deref for Wtf8Buf

Source§

type Target = Wtf8

The resulting type after dereferencing.
Source§

fn deref(&self) -> &Wtf8

Dereferences the value.
Source§

impl DerefMut for Wtf8Buf

Source§

fn deref_mut(&mut self) -> &mut Wtf8

Mutably dereferences the value.
Source§

impl Display for Wtf8Buf

Formats the string with unpaired surrogates substituted with the replacement character, U+FFFD.

Source§

fn fmt(&self, formatter: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.
Source§

impl Eq for Wtf8Buf

Source§

#[doc(hidden)] fn assert_receiver_is_total_eq(&self)

Source§

impl Extend<CodePoint> for Wtf8Buf

Append code points from an iterator to the string.

This replaces surrogate code point pairs with supplementary code points, like concatenating ill-formed UTF-16 strings effectively would.

Source§

fn extend<T: IntoIterator<Item = CodePoint>>(&mut self, iter: T)

Extends a collection with the contents of an iterator.
Source§

fn extend_one(&mut self, code_point: CodePoint)

🔬This is a nightly-only experimental API. (extend_one #72631)
Extends a collection with exactly one element.
Source§

fn extend_reserve(&mut self, additional: usize)

🔬This is a nightly-only experimental API. (extend_one #72631)
Reserves capacity in a collection for the given number of additional elements.
Source§

#[doc(hidden)] unsafe fn extend_one_unchecked(&mut self, item: A)
where Self: Sized,

🔬This is a nightly-only experimental API. (extend_one_unchecked)
Extends a collection with one element, without checking there is enough capacity for it.
Source§

impl FromIterator<CodePoint> for Wtf8Buf

Creates a new WTF-8 string from an iterator of code points.

This replaces surrogate code point pairs with supplementary code points, like concatenating ill-formed UTF-16 strings effectively would.

Source§

fn from_iter<T: IntoIterator<Item = CodePoint>>(iter: T) -> Wtf8Buf

Creates a value from an iterator.
Source§

impl Hash for Wtf8Buf

Source§

fn hash<H: Hasher>(&self, state: &mut H)

Feeds this value into the given Hasher.
1.3.0 · Source§

fn hash_slice<H>(data: &[Self], state: &mut H)
where H: Hasher, Self: Sized,

Feeds a slice of this type into the given Hasher.
Source§

impl Ord for Wtf8Buf

Source§

fn cmp(&self, other: &Wtf8Buf) -> Ordering

This method returns an Ordering between self and other.
1.21.0 · Source§

fn max(self, other: Self) -> Self
where Self: Sized,

Compares and returns the maximum of two values.
1.21.0 · Source§

fn min(self, other: Self) -> Self
where Self: Sized,

Compares and returns the minimum of two values.
1.50.0 · Source§

fn clamp(self, min: Self, max: Self) -> Self
where Self: Sized,

Restrict a value to a certain interval.
Source§

impl PartialEq for Wtf8Buf

Source§

fn eq(&self, other: &Wtf8Buf) -> bool

Tests for self and other values to be equal, and is used by ==.
1.0.0 · Source§

fn ne(&self, other: &Rhs) -> bool

Tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.
Source§

impl PartialOrd for Wtf8Buf

Source§

fn partial_cmp(&self, other: &Wtf8Buf) -> Option<Ordering>

This method returns an ordering between self and other values if one exists.
1.0.0 · Source§

fn lt(&self, other: &Rhs) -> bool

Tests less than (for self and other) and is used by the < operator.
1.0.0 · Source§

fn le(&self, other: &Rhs) -> bool

Tests less than or equal to (for self and other) and is used by the <= operator.
1.0.0 · Source§

fn gt(&self, other: &Rhs) -> bool

Tests greater than (for self and other) and is used by the > operator.
1.0.0 · Source§

fn ge(&self, other: &Rhs) -> bool

Tests greater than or equal to (for self and other) and is used by the >= operator.
Source§

#[doc(hidden)] fn __chaining_lt(&self, other: &Rhs) -> ControlFlow<bool>

🔬This is a nightly-only experimental API. (partial_ord_chaining_methods)
If self == other, returns ControlFlow::Continue(()). Otherwise, returns ControlFlow::Break(self < other).
Source§

#[doc(hidden)] fn __chaining_le(&self, other: &Rhs) -> ControlFlow<bool>

🔬This is a nightly-only experimental API. (partial_ord_chaining_methods)
Same as __chaining_lt, but for <= instead of <.
Source§

#[doc(hidden)] fn __chaining_gt(&self, other: &Rhs) -> ControlFlow<bool>

🔬This is a nightly-only experimental API. (partial_ord_chaining_methods)
Same as __chaining_lt, but for > instead of <.
Source§

#[doc(hidden)] fn __chaining_ge(&self, other: &Rhs) -> ControlFlow<bool>

🔬This is a nightly-only experimental API. (partial_ord_chaining_methods)
Same as __chaining_lt, but for >= instead of <.
Source§

impl StructuralPartialEq for Wtf8Buf