#[doc(hidden)] pub struct Wtf8Buf {
    bytes: Vec<u8>,
    is_known_utf8: bool,
}
🔬This is a nightly-only experimental API. (wtf8_internals)
An owned, growable string of well-formed WTF-8 data.
Similar to String, but can additionally contain surrogate code points if they’re not in a surrogate pair.
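As a rough illustration of what "can additionally contain surrogate code points" means, the sketch below uses only stable standard-library calls plus one hypothetical helper, encode_surrogate, which is not part of this API. It assumes the usual three-byte generalized UTF-8 layout for code points in the range U+D800 to U+DFFF.

```rust
/// Hypothetical helper, for illustration only: encodes one surrogate code
/// point (U+D800..=U+DFFF) using the three-byte generalized UTF-8 layout.
fn encode_surrogate(surrogate: u16) -> [u8; 3] {
    assert!((0xD800..=0xDFFF).contains(&surrogate));
    [
        0xE0 | (surrogate >> 12) as u8,
        0x80 | ((surrogate >> 6) & 0x3F) as u8,
        0x80 | (surrogate & 0x3F) as u8,
    ]
}

fn main() {
    // A lone surrogate is not a Unicode scalar value, so `char` rejects it.
    assert!(char::from_u32(0xD800).is_none());
    // Strict UTF-16 decoding into a `String` rejects it as well.
    assert!(String::from_utf16(&[0x0061, 0xD800]).is_err());
    // The three-byte form is representable as raw bytes...
    let bytes = encode_surrogate(0xD800);
    assert_eq!(bytes, [0xED, 0xA0, 0x80]);
    // ...but those bytes are not valid UTF-8.
    assert!(std::str::from_utf8(&bytes).is_err());
}
```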
Fields§
§bytes: Vec<u8>
§is_known_utf8: bool
Do we know that bytes holds a valid UTF-8 encoding? We can easily know this if we’re constructed from a String or &str.
It is possible for bytes to have valid UTF-8 without this being set, such as when we’re concatenating &Wtf8’s and surrogates become paired, as we don’t bother to rescan the entire string.
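The flag behaves like a one-way hint: set means "definitely UTF-8", unset means "unknown". A minimal sketch of that idea follows. MiniBuf and its methods are hypothetical stand-ins rather than the real type, and the sketch validates with from_utf8 to stay in safe code, where an implementation relying on the flag could skip validation.

```rust
/// Hypothetical miniature of the two-field layout described above.
struct MiniBuf {
    bytes: Vec<u8>,
    is_known_utf8: bool,
}

impl MiniBuf {
    /// Built from a `&str`, so the bytes are known to be UTF-8.
    fn from_str(s: &str) -> MiniBuf {
        MiniBuf { bytes: s.as_bytes().to_vec(), is_known_utf8: true }
    }

    /// Built from raw bytes without scanning them, so the flag stays unset
    /// even if the bytes happen to be valid UTF-8.
    fn from_unscanned(bytes: Vec<u8>) -> MiniBuf {
        MiniBuf { bytes, is_known_utf8: false }
    }

    /// Returns a `&str` only when the flag is set. This sketch re-validates
    /// the bytes instead of using an unchecked conversion.
    fn as_known_utf8(&self) -> Option<&str> {
        if self.is_known_utf8 {
            std::str::from_utf8(&self.bytes).ok()
        } else {
            None
        }
    }
}

fn main() {
    let known = MiniBuf::from_str("abc");
    assert_eq!(known.as_known_utf8(), Some("abc"));

    // Valid UTF-8, but nobody recorded that fact, so the fast path declines.
    let unknown = MiniBuf::from_unscanned(b"abc".to_vec());
    assert_eq!(unknown.as_known_utf8(), None);
}
```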
Implementations§
impl Wtf8Buf
pub fn new() -> Wtf8Buf
🔬This is a nightly-only experimental API. (wtf8_internals)
Creates a new, empty WTF-8 string.
pub fn with_capacity(capacity: usize) -> Wtf8Buf
Creates a new, empty WTF-8 string with pre-allocated capacity for capacity bytes.
pub unsafe fn from_bytes_unchecked(value: Vec<u8>) -> Wtf8Buf
Creates a WTF-8 string from a WTF-8 byte vec.
Since the byte vec is not checked for valid WTF-8, this function is marked unsafe.
pub const fn from_string(string: String) -> Wtf8Buf
Creates a WTF-8 string from a UTF-8 String.
This takes ownership of the String and does not copy.
Since WTF-8 is a superset of UTF-8, this always succeeds.
pub fn from_str(s: &str) -> Wtf8Buf
Creates a WTF-8 string from a UTF-8 &str slice.
This copies the content of the slice.
Since WTF-8 is a superset of UTF-8, this always succeeds.
pub fn clear(&mut self)
pub fn from_wide(v: &[u16]) -> Wtf8Buf
Creates a WTF-8 string from a potentially ill-formed UTF-16 slice of 16-bit code units.
This is lossless: calling .encode_wide() on the resulting string will always return the original code units.
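The lossless property can be approximated on stable Rust with char::decode_utf16, which reports unpaired surrogates instead of discarding them. The roundtrip function below is a hypothetical illustration of the idea, not the implementation of from_wide or encode_wide.

```rust
use std::char::decode_utf16;

/// Hypothetical illustration: decodes possibly ill-formed UTF-16 and
/// re-encodes it, keeping unpaired surrogates as they were.
fn roundtrip(units: &[u16]) -> Vec<u16> {
    let mut out = Vec::with_capacity(units.len());
    for item in decode_utf16(units.iter().copied()) {
        match item {
            // A scalar value: re-encode as one or two code units.
            Ok(c) => {
                let mut buf = [0u16; 2];
                out.extend_from_slice(c.encode_utf16(&mut buf));
            }
            // An unpaired surrogate: pass the code unit through unchanged.
            Err(e) => out.push(e.unpaired_surrogate()),
        }
    }
    out
}

fn main() {
    // 'a', lone lead surrogate, a proper pair (U+1F600), lone trail surrogate.
    let units = [0x0061, 0xD800, 0xD83D, 0xDE00, 0xDC00];
    assert_eq!(roundtrip(&units), units);
}
```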
unsafe fn push_code_point_unchecked(&mut self, code_point: CodePoint)
Appends the given code point to the end of this string.
This does not include the WTF-8 concatenation check or is_known_utf8 check.
Copied from String::push.
pub fn as_slice(&self) -> &Wtf8
pub fn as_mut_slice(&mut self) -> &mut Wtf8
fn as_known_utf8(&self) -> Option<&str>
Converts the string to UTF-8 without validation, if it was created from valid UTF-8.
pub fn reserve(&mut self, additional: usize)
Reserves capacity for at least additional more bytes to be inserted in the given Wtf8Buf.
The collection may reserve more space to avoid frequent reallocations.
§Panics
Panics if the new capacity exceeds isize::MAX bytes.
pub fn try_reserve(&mut self, additional: usize) -> Result<(), TryReserveError>
Tries to reserve capacity for at least additional more bytes to be inserted in the given Wtf8Buf. The Wtf8Buf may reserve more space to avoid frequent reallocations. After calling try_reserve, capacity will be greater than or equal to self.len() + additional. Does nothing if capacity is already sufficient. This method preserves the contents even if an error occurs.
§Errors
If the capacity overflows, or the allocator reports a failure, then an error is returned.
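Since this type is internal, the same contract is easiest to try out on Vec<u8>, whose stable try_reserve has the same shape. The append_with_reserve helper below is a hypothetical name used only for this sketch.

```rust
use std::collections::TryReserveError;

/// Hypothetical helper: reserves first, then appends, so an allocation
/// failure surfaces as an error instead of aborting.
fn append_with_reserve(buf: &mut Vec<u8>, data: &[u8]) -> Result<(), TryReserveError> {
    buf.try_reserve(data.len())?;
    buf.extend_from_slice(data);
    Ok(())
}

fn main() {
    let mut buf = b"ab".to_vec();

    // After a successful reservation, capacity covers len + additional.
    buf.try_reserve(3).expect("small reservation should succeed");
    assert!(buf.capacity() >= buf.len() + 3);

    append_with_reserve(&mut buf, b"cde").expect("append should succeed");
    assert_eq!(buf, b"abcde");

    // An impossible request reports an error and leaves the contents intact.
    assert!(buf.try_reserve(usize::MAX).is_err());
    assert_eq!(buf, b"abcde");
}
```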
pub fn reserve_exact(&mut self, additional: usize)
pub fn try_reserve_exact(&mut self, additional: usize) -> Result<(), TryReserveError>
Tries to reserve the minimum capacity for exactly additional more bytes to be inserted in the given Wtf8Buf. After calling try_reserve_exact, capacity will be greater than or equal to self.len() + additional if it returns Ok(()).
Does nothing if the capacity is already sufficient.
Note that the allocator may give the Wtf8Buf more space than it requests. Therefore, capacity can not be relied upon to be precisely minimal. Prefer try_reserve if future insertions are expected.
§Errors
If the capacity overflows, or the allocator reports a failure, then an error is returned.
pub fn shrink_to_fit(&mut self)
pub fn shrink_to(&mut self, min_capacity: usize)
pub fn leak<'a>(self) -> &'a mut Wtf8
pub fn capacity(&self) -> usize
Returns the number of bytes that this string buffer can hold without reallocating.
pub fn push_str(&mut self, other: &str)
Appends a UTF-8 slice at the end of the string.
pub fn push_wtf8(&mut self, other: &Wtf8)
Appends a WTF-8 slice at the end of the string.
This replaces newly paired surrogates at the boundary with a supplementary code point, like concatenating ill-formed UTF-16 strings effectively would.
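The replacement at the boundary follows the standard UTF-16 surrogate arithmetic. The sketch below shows only that arithmetic. combine_surrogates is a hypothetical helper and does not reflect how push_wtf8 inspects or rewrites its byte buffer.

```rust
/// Hypothetical helper: combines a lead and a trail surrogate into the
/// supplementary code point they encode, or returns `None` if the two
/// values do not form a pair in that order.
fn combine_surrogates(lead: u16, trail: u16) -> Option<char> {
    if !(0xD800..=0xDBFF).contains(&lead) || !(0xDC00..=0xDFFF).contains(&trail) {
        return None;
    }
    let code_point = 0x10000 + (((lead - 0xD800) as u32) << 10) + (trail - 0xDC00) as u32;
    char::from_u32(code_point)
}

fn main() {
    // A string ending in U+D83D followed by one starting with U+DE00
    // yields U+1F600 at the join.
    assert_eq!(combine_surrogates(0xD83D, 0xDE00), Some('\u{1F600}'));
    // Trail followed by lead is not a pair, so both stay as lone surrogates.
    assert_eq!(combine_surrogates(0xDE00, 0xD83D), None);
}
```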
pub fn push_char(&mut self, c: char)
Appends a Unicode scalar value at the end of the string.
pub fn push(&mut self, code_point: CodePoint)
Appends a code point at the end of the string.
This replaces newly paired surrogates at the boundary with a supplementary code point, like concatenating ill-formed UTF-16 strings effectively would.
pub fn truncate(&mut self, new_len: usize)
Shortens a string to the specified length.
§Panics
Panics if new_len > current length, or if new_len is not a code point boundary.
pub fn into_bytes(self) -> Vec<u8>
Consumes the WTF-8 string and converts it to a vec of bytes.
pub fn into_string(self) -> Result<String, Wtf8Buf>
Consumes the WTF-8 string and tries to convert it to UTF-8.
This does not copy the data.
If the contents are not well-formed UTF-8 (that is, if the string contains surrogates), the original WTF-8 string is returned instead.
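The "give the original back on failure" shape can be sketched on a plain byte vector with String::from_utf8, whose error type hands the buffer back through into_bytes. The free function into_string below is a hypothetical stand-in that works on Vec<u8> rather than on Wtf8Buf.

```rust
/// Hypothetical stand-in operating on raw bytes: returns the `String` on
/// success, or the untouched input buffer on failure.
fn into_string(bytes: Vec<u8>) -> Result<String, Vec<u8>> {
    String::from_utf8(bytes).map_err(|error| error.into_bytes())
}

fn main() {
    // Plain UTF-8 converts successfully.
    assert_eq!(into_string(b"abc".to_vec()), Ok(String::from("abc")));

    // 'a' followed by an encoded lone surrogate (U+D800) is not UTF-8,
    // so the caller gets the original bytes back.
    let with_surrogate = vec![0x61, 0xED, 0xA0, 0x80];
    assert_eq!(into_string(with_surrogate.clone()), Err(with_surrogate));
}
```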
pub fn into_string_lossy(self) -> String
Consumes the WTF-8 string and converts it lossily to UTF-8.
This does not copy the data (but may overwrite parts of it in place).
Surrogates are replaced with "\u{FFFD}" (the replacement character “�”).
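In-place overwriting is possible because an encoded surrogate and U+FFFD both occupy three bytes. The sketch below applies that observation to a raw byte vector. The free function into_string_lossy here is a hypothetical illustration that assumes its input is well-formed WTF-8, and it is not the actual implementation.

```rust
/// Hypothetical illustration on raw bytes. Assumes `bytes` is well-formed
/// WTF-8, so every 0xED byte is the lead byte of a three-byte sequence.
fn into_string_lossy(mut bytes: Vec<u8>) -> String {
    let mut i = 0;
    while i < bytes.len() {
        // 0xED followed by 0xA0..=0xBF starts an encoded surrogate.
        if bytes[i] == 0xED && i + 2 < bytes.len() && bytes[i + 1] >= 0xA0 {
            // U+FFFD is EF BF BD in UTF-8: same width, so no reallocation.
            bytes[i..i + 3].copy_from_slice(&[0xEF, 0xBF, 0xBD]);
            i += 3;
        } else {
            i += 1;
        }
    }
    String::from_utf8(bytes).expect("input was assumed to be well-formed WTF-8")
}

fn main() {
    // 'a', encoded U+D800, 'b'.
    let lossy = into_string_lossy(vec![0x61, 0xED, 0xA0, 0x80, 0x62]);
    assert_eq!(lossy, "a\u{FFFD}b");
    // Same result as the standard lossy UTF-16 conversion.
    assert_eq!(lossy, String::from_utf16_lossy(&[0x0061, 0xD800, 0x0062]));
}
```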
pub fn into_box(self) -> Box<Wtf8>
Converts this Wtf8Buf into a boxed Wtf8.
pub fn from_box(boxed: Box<Wtf8>) -> Wtf8Buf
Converts a Box<Wtf8> into a Wtf8Buf.
pub unsafe fn extend_from_slice_unchecked(&mut self, other: &[u8])
Provides plumbing to the core Vec::extend_from_slice.
This is a better-behaved alternative to giving outer types full mutable access to the core Vec.
Methods from Deref<Target = Wtf8>§
pub fn to_owned(&self) -> Wtf8Buf
pub fn clone_into(&self, buf: &mut Wtf8Buf)
pub fn to_string_lossy(&self) -> Cow<'_, str>
pub fn into_box(&self) -> Box<Wtf8>
pub fn into_arc(&self) -> Arc<Wtf8>
pub fn into_rc(&self) -> Rc<Wtf8>
pub fn to_ascii_lowercase(&self) -> Wtf8Buf
pub fn to_ascii_uppercase(&self) -> Wtf8Buf
pub fn len(&self) -> usize
Returns the length, in WTF-8 bytes.
pub fn is_empty(&self) -> bool
pub fn ascii_byte_at(&self, position: usize) -> u8
Returns the code point at position if it is in the ASCII range, or b'\xFF' otherwise.
§Panics
Panics if position is beyond the end of the string.
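A sentinel of 0xFF is safe because that byte never appears in UTF-8 or WTF-8 data. The behaviour described above can be sketched as a free function over a byte slice. This ascii_byte_at takes the bytes as a parameter and is a hypothetical illustration, not the method itself.

```rust
/// Hypothetical illustration over a byte slice: returns the byte at
/// `position` if it is ASCII, or 0xFF otherwise. Indexing panics when
/// `position` is out of bounds.
fn ascii_byte_at(bytes: &[u8], position: usize) -> u8 {
    match bytes[position] {
        ascii @ 0x00..=0x7F => ascii,
        _ => 0xFF,
    }
}

fn main() {
    let bytes = "aé".as_bytes(); // [0x61, 0xC3, 0xA9]
    assert_eq!(ascii_byte_at(bytes, 0), b'a');
    // Lead and continuation bytes of a multi-byte sequence map to 0xFF.
    assert_eq!(ascii_byte_at(bytes, 1), 0xFF);
    assert_eq!(ascii_byte_at(bytes, 2), 0xFF);
}
```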
pub fn code_points(&self) -> Wtf8CodePoints<'_>
Returns an iterator for the string’s code points.
pub fn as_bytes(&self) -> &[u8]
Accesses the raw bytes of the WTF-8 data.
pub fn as_str(&self) -> Result<&str, Utf8Error>
Tries to convert the string to UTF-8 and return a &str slice.
Returns a Utf8Error if the string contains surrogates.
This does not copy the data.
pub fn encode_wide(&self) -> EncodeWide<'_>
Converts the WTF-8 string to potentially ill-formed UTF-16 and returns an iterator of 16-bit code units.
This is lossless: calling Wtf8Buf::from_wide on the resulting code units would always return the original WTF-8 string.
pub fn next_surrogate(&self, pos: usize) -> Option<(usize, u16)>
pub fn final_lead_surrogate(&self) -> Option<u16>
pub fn initial_trail_surrogate(&self) -> Option<u16>
pub fn make_ascii_lowercase(&mut self)
pub fn make_ascii_uppercase(&mut self)
pub fn is_ascii(&self) -> bool
pub fn eq_ignore_ascii_case(&self, other: &Wtf8) -> bool
pub fn is_code_point_boundary(&self, index: usize) -> bool
Copied from str::is_char_boundary.
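For reference, the boundary test used by str::is_char_boundary amounts to checking that the byte at index is not a continuation byte (0x80 to 0xBF). A sketch over raw bytes follows. The free function is_boundary is a hypothetical name, and the byte comparison mirrors the standard-library trick without being copied from this type.

```rust
/// Hypothetical illustration over a byte slice: true if `index` is 0, the
/// length of the slice, or the position of a non-continuation byte.
fn is_boundary(bytes: &[u8], index: usize) -> bool {
    if index == 0 {
        return true;
    }
    match bytes.get(index) {
        // Past the last byte: only the exact end counts as a boundary.
        None => index == bytes.len(),
        // Continuation bytes are 0x80..=0xBF, i.e. below -0x40 as `i8`.
        Some(&byte) => (byte as i8) >= -0x40,
    }
}

fn main() {
    // 'a', encoded lone surrogate U+D800 (ED A0 80), 'b'.
    let bytes = [0x61, 0xED, 0xA0, 0x80, 0x62];
    assert!(is_boundary(&bytes, 1)); // start of the surrogate
    assert!(!is_boundary(&bytes, 2)); // inside it
    assert!(!is_boundary(&bytes, 3)); // inside it
    assert!(is_boundary(&bytes, 4)); // start of 'b'
    assert!(is_boundary(&bytes, 5)); // end of data
    assert!(!is_boundary(&bytes, 6)); // out of range
}
```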
pub fn check_utf8_boundary(&self, index: usize)
Verify that index is at the edge of either a valid UTF-8 codepoint (i.e. a codepoint that’s not a surrogate) or of the whole string.
These are the cases currently permitted by OsStr::self_encoded_bytes.
Splitting between surrogates is valid as far as WTF-8 is concerned, but
we do not permit it in the public API because WTF-8 is considered an
implementation detail.
Trait Implementations§
impl Debug for Wtf8Buf
Formats the string in double quotes, with characters escaped according to char::escape_debug and unpaired surrogates represented as \u{xxxx}, where each x is a hexadecimal digit.
For example, the code points [U+0061, U+D800, U+000A] are formatted as "a\u{D800}\n".
impl Display for Wtf8Buf
Formats the string with unpaired surrogates substituted with the replacement character, U+FFFD.
impl Extend<CodePoint> for Wtf8Buf
Appends code points from an iterator to the string.
This replaces surrogate code point pairs with supplementary code points, like concatenating ill-formed UTF-16 strings effectively would.
fn extend<T: IntoIterator<Item = CodePoint>>(&mut self, iter: T)
fn extend_one(&mut self, code_point: CodePoint)
🔬This is a nightly-only experimental API. (extend_one #72631)
fn extend_reserve(&mut self, additional: usize)
🔬This is a nightly-only experimental API. (extend_one #72631)
#[doc(hidden)] unsafe fn extend_one_unchecked(&mut self, item: A) where Self: Sized
🔬This is a nightly-only experimental API. (extend_one_unchecked)
impl FromIterator<CodePoint> for Wtf8Buf
Creates a new WTF-8 string from an iterator of code points.
This replaces surrogate code point pairs with supplementary code points, like concatenating ill-formed UTF-16 strings effectively would.
impl Ord for Wtf8Buf
impl PartialOrd for Wtf8Buf
#[doc(hidden)] fn __chaining_lt(&self, other: &Rhs) -> ControlFlow<bool>
🔬This is a nightly-only experimental API. (partial_ord_chaining_methods)
If self == other, returns ControlFlow::Continue(()). Otherwise, returns ControlFlow::Break(self < other).
#[doc(hidden)] fn __chaining_le(&self, other: &Rhs) -> ControlFlow<bool>
Same as __chaining_lt, but for <= instead of <.
#[doc(hidden)] fn __chaining_gt(&self, other: &Rhs) -> ControlFlow<bool>
Same as __chaining_lt, but for > instead of <.
#[doc(hidden)] fn __chaining_ge(&self, other: &Rhs) -> ControlFlow<bool>
Same as __chaining_lt, but for >= instead of <.