Update: updated for the latest Rust. As of Rust 1.0.0-alpha, to_lowercase()/to_uppercase() are methods of the CharExt trait, and there is no separate Ascii type anymore: ASCII operations are now gathered in two traits, AsciiExt and OwnedAsciiExt. They are marked as unstable, so they may well change during the Rust beta period.
Your code is incorrect because it accesses individual bytes to perform char-based operations, but in UTF-8 characters are not bytes. It won't work correctly for anything that is not ASCII.
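To make the byte/character distinction concrete, here is a minimal sketch for current stable Rust. The helper names byte_and_char_len and byte_lens_around_uppercase are made up for illustration and are not part of any library:

```rust
/// Hypothetical helper: (number of UTF-8 bytes, number of chars) in `s`.
fn byte_and_char_len(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

/// Hypothetical helper: UTF-8 byte length before and after uppercasing `s`.
fn byte_lens_around_uppercase(s: &str) -> (usize, usize) {
    (s.len(), s.to_uppercase().len())
}
```

For example, byte_and_char_len("\u{e9}") (the letter é) gives (2, 1): one character, two bytes. byte_lens_around_uppercase("\u{131}") (dotless ı) gives (2, 1), because its uppercase form is the one-byte ASCII letter I.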
In fact, there is no way to do this correctly in place, because a case conversion may change the number of bytes a character occupies, and that would require reallocating the whole string. You should iterate over the characters and collect them into a new string:
fn titlecase_word(word: &mut String) {
    if word.is_empty() { return; }
    let mut result = String::with_capacity(word.len());
    {
        let mut chars = word.chars();
        result.push(chars.next().unwrap().to_uppercase());
        for c in chars {
            result.push(c.to_lowercase());
        }
    }
    *word = result;
}
Because you need to generate a new string anyway, it is better to just return it instead of replacing the old one. In this case it is also better to pass a slice to the function:
fn titlecase_word(word: &str) -> String {
    let mut result = String::with_capacity(word.len());
    if !word.is_empty() {
        let mut chars = word.chars();
        result.push(chars.next().unwrap().to_uppercase());
        for c in chars {
            result.push(c.to_lowercase());
        }
    }
    result
}
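On current stable Rust, char::to_uppercase() and char::to_lowercase() return iterators rather than a single char, since one character can map to several. A minimal sketch of the same slice-taking function under that assumption might look like this (the name titlecase_word_multi is made up here to keep it apart from the code above):

```rust
// Hypothetical variant for current stable Rust, where the case
// conversions yield one or more chars per input char.
fn titlecase_word_multi(word: &str) -> String {
    let mut result = String::with_capacity(word.len());
    let mut chars = word.chars();
    if let Some(first) = chars.next() {
        // Push every char produced by uppercasing the first char.
        for u in first.to_uppercase() {
            result.push(u);
        }
        // Push every char produced by lowercasing the remaining chars.
        for c in chars {
            for l in c.to_lowercase() {
                result.push(l);
            }
        }
    }
    result
}
```

With this sketch, titlecase_word_multi("hELLO") returns "Hello", and an empty input returns an empty string without any unwrap().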
Also, String has an extend() method from the Extend trait, which provides a more idiomatic approach than a for loop:
fn titlecase_word(word: &str) -> String {
    let mut result = String::with_capacity(word.len());
    if !word.is_empty() {
        let mut chars = word.chars();
        result.push(chars.next().unwrap().to_uppercase());
        result.extend(chars.map(|c| c.to_lowercase()));
    }
    result
}
In fact, with iterators it is possible to shorten it even further:
fn titlecase_word(word: &str) -> String {
    word.chars().enumerate()
        .map(|(i, c)| if i == 0 { c.to_uppercase() } else { c.to_lowercase() })
        .collect()
}
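On current stable Rust, where to_uppercase() and to_lowercase() return distinct iterator types, the two branches of an if inside a single map() closure like the one above no longer have the same type. A rough equivalent, assuming that API, chains the uppercased first character with the lowercased rest (titlecase_word_chain is a made-up name):

```rust
// Hypothetical iterator-only variant for current stable Rust.
fn titlecase_word_chain(word: &str) -> String {
    let mut chars = word.chars();
    match chars.next() {
        // Uppercase the first char, then append the lowercased remainder.
        Some(first) => first
            .to_uppercase()
            .chain(chars.flat_map(|c| c.to_lowercase()))
            .collect(),
        // Empty input: nothing to convert.
        None => String::new(),
    }
}
```

For instance, titlecase_word_chain("rUST") returns "Rust".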
If you know in advance that you're working with ASCII, however, you could use the traits provided by the std::ascii module:
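fn titlecase_word(word: String) -> String {
    use std::ascii::{AsciiExt, OwnedAsciiExt};
    assert!(word.is_ascii());
    // Lowercase everything, reusing the string's own buffer.
    let mut result = word.into_bytes().into_ascii_lowercase();
    // Guard against the empty string, where result[0] would panic.
    if !result.is_empty() {
        result[0] = result[0].to_ascii_uppercase();
    }
    String::from_utf8(result).unwrap()
}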
This function will panic if the input string contains any non-ASCII character.
This function won't allocate anything and will modify the string contents in place. However, you can't write such a function with a single &mut String argument without unsafe and without extra allocations, because into_bytes() takes the string by value, and moving out of a &mut reference is disallowed.
You could use std::mem::swap() and a temporary variable holding an empty string, though; it won't require unsafe, but it may require an allocation for the empty string. I don't remember if it actually does need an allocation; if not, then you can write such a function, though the code will be somewhat cumbersome. Anyway, &mut arguments are not really idiomatic in Rust.
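The swap idea can be sketched as follows for current stable Rust. String::new() is documented not to allocate, so the temporary costs nothing on the heap. Both function names are hypothetical. Because OwnedAsciiExt does not exist in current stable Rust, the by-value helper uses the inherent make_ascii_lowercase()/make_ascii_uppercase() methods instead; with those the swap is not strictly necessary, and it is kept only to illustrate the pattern:

```rust
// Hypothetical by-value helper: title-cases an ASCII string, reusing its buffer.
fn titlecase_ascii(mut word: String) -> String {
    assert!(word.is_ascii());
    word.make_ascii_lowercase();
    // get_mut(0..1) is None for an empty string, so nothing panics there.
    if let Some(first) = word.get_mut(0..1) {
        first.make_ascii_uppercase();
    }
    word
}

// Hypothetical wrapper with a single &mut String argument.
fn titlecase_ascii_in_place(word: &mut String) {
    // An empty String owns no heap buffer, so this does not allocate.
    let mut owned = String::new();
    // After the swap, `owned` holds the text and `*word` is empty.
    std::mem::swap(word, &mut owned);
    // Move the converted string (same buffer) back into place.
    *word = titlecase_ascii(owned);
}
```

One caveat of this shape: if the helper panics on non-ASCII input, the caller's string has already been swapped out and is left empty.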