The thing is, slicing in Rust happens by byte index, not char index!
So when you do:
let full = String::from("hello world");
let part = &full[0..5]; // bytes 0 through 4 (the end index 5 is exclusive)
println!("{}", part); // prints "hello"
Every character here takes exactly 1 byte, because plain English letters, digits, and punctuation are ASCII, and UTF-8 encodes each ASCII character in a single byte.
'A' => 01000001 (1 byte)
'z' => 01111010 (1 byte)
So when you slice from the 0th to the 4th index on the bytes, you get "hello"!
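To see this for yourself, here is a minimal sketch that compares the byte length with the character count of an ASCII string. The helper name `byte_and_char_len` is made up for this illustration.

```rust
/// Hypothetical helper: returns (byte length, char count) of a string.
fn byte_and_char_len(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

fn main() {
    let full = String::from("hello world");
    let (bytes, chars) = byte_and_char_len(&full);

    // For pure ASCII text the two numbers are the same.
    println!("{} bytes, {} chars", bytes, chars); // 11 bytes, 11 chars

    // The first five bytes are the ASCII codes of 'h', 'e', 'l', 'l', 'o'.
    println!("{:?}", &full.as_bytes()[0..5]); // [104, 101, 108, 108, 111]
}
```

Since bytes and chars line up one to one here, a byte range and a char range mean the same thing, which is why the slice "just works".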
Things change once the string contains non-ASCII characters, such as Hindi text or emojis. Let's take an example:
let s = String::from("नमस्ते");
let slice = &s[0..2]; // ⚠️ this will panic!
Why do you think this panicked?
Rust strings are UTF-8 encoded, which means that characters outside the ASCII range, such as Hindi letters or emojis, take more than 1 byte each.
Character | UTF-8 Encoding | Bytes |
---|---|---|
A | 0x41 | 1 |
ñ | 0xC3 0xB1 | 2 |
न | 0xE0 0xA4 0xA8 | 3 |
😊 | 0xF0 0x9F 0x98 0x8A | 4 |
So yeah, Hindi characters like न use 3 bytes!
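You can check the byte counts from the table yourself with `char::len_utf8`. This is a small sketch; the helper `byte_widths` is a name invented for this example.

```rust
/// Hypothetical helper: the UTF-8 width in bytes of every char in `s`.
fn byte_widths(s: &str) -> Vec<usize> {
    s.chars().map(|c| c.len_utf8()).collect()
}

fn main() {
    // The same four characters as in the table.
    println!("{:?}", byte_widths("Añन😊")); // [1, 2, 3, 4]

    let s = String::from("नमस्ते");
    // 6 chars of 3 bytes each, so len() reports 18, not 6.
    println!("{} bytes, {} chars", s.len(), s.chars().count()); // 18 bytes, 6 chars
}
```

Note that `len()` always counts bytes, so it tells you how far a byte slice can go, not how many characters you will get.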
So when you slice "नमस्ते" with 0..2, you're cutting after only 2 of the 3 bytes of न, right in the middle of a character. The result would not be valid UTF-8.
So:
&s[0..3] → valid (न)
&s[0..2] → ❌ invalid, slicing inside the byte-sequence of न
Rust guarantees that every &str is valid UTF-8, so instead of handing you a broken string, the program panics at runtime!
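If you want to see which byte indices are safe to cut at, the standard `str::is_char_boundary` method answers exactly that. Here is a rough sketch; `char_boundaries` is a hypothetical helper written for this post, not something from the standard library.

```rust
/// Hypothetical helper: every byte index at which `s` may be cut safely.
fn char_boundaries(s: &str) -> Vec<usize> {
    (0..=s.len()).filter(|&i| s.is_char_boundary(i)).collect()
}

fn main() {
    let s = String::from("नमस्ते");

    assert!(!s.is_char_boundary(2)); // inside the 3 bytes of न
    assert!(s.is_char_boundary(3)); // right after न

    // Every char here is 3 bytes wide, so the boundaries are multiples of 3.
    println!("{:?}", char_boundaries(&s)); // [0, 3, 6, 9, 12, 15, 18]
}
```

Any slice whose start or end is not in that list is the kind that panics.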
Then how do you slice properly?
Here are two common options:
1. Work with chars instead of byte indices:
let s = String::from("नमस्ते");
let first_char = s.chars().nth(0).unwrap(); // ✅ gives 'न'
2. Use .get() with a range:
let slice = s.get(0..3); // returns Option<&str>
So you can safely do:
if let Some(valid_slice) = s.get(0..3) {
    println!("{}", valid_slice);
}
This way, Rust won’t panic if the range is invalid. Instead, .get() just gives you None, the empty variant of Option (not a Result).
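Here is a short sketch of how `.get()` behaves for a valid range, a range that cuts through a character, and a range that runs past the end of the string. The helper `slice_or_message` is a made-up name, used only to show one way of handling the `None` case.

```rust
/// Hypothetical helper: returns the requested byte range as a String,
/// or a fallback message when the range cannot be sliced.
fn slice_or_message(s: &str, start: usize, end: usize) -> String {
    match s.get(start..end) {
        Some(valid) => valid.to_string(),
        None => format!("{}..{} is not a valid slice", start, end),
    }
}

fn main() {
    let s = String::from("नमस्ते");

    assert_eq!(s.get(0..3), Some("न")); // ends on a char boundary
    assert_eq!(s.get(0..2), None); // cuts through न
    assert_eq!(s.get(0..100), None); // out of bounds, still no panic

    println!("{}", slice_or_message(&s, 0, 3));
    println!("{}", slice_or_message(&s, 0, 2));
}
```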
✅ Always slice only if you know you’re at valid UTF-8 character boundaries.
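One way to make sure you land on a valid boundary is to let `char_indices()` find the byte offset for you. The sketch below takes the first n chars of a string; `first_n_chars` is a hypothetical helper written for this post, and it counts Unicode scalar values, which are not always the same as what a reader sees as one character (the conjunct in "स्ते" is built from several chars).

```rust
/// Hypothetical helper: the prefix of `s` holding its first `n` chars.
/// Returns the whole string when it has `n` chars or fewer.
fn first_n_chars(s: &str, n: usize) -> &str {
    match s.char_indices().nth(n) {
        // byte_idx is where char number n starts, so it is always a boundary.
        Some((byte_idx, _)) => &s[..byte_idx],
        None => s,
    }
}

fn main() {
    let s = String::from("नमस्ते");
    println!("{}", first_n_chars(&s, 1)); // न
    println!("{}", first_n_chars(&s, 2)); // नम
    println!("{}", first_n_chars("hello world", 5)); // hello
}
```

Because the byte index comes from `char_indices()` itself, the `&s[..byte_idx]` slice here cannot panic.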