The Slice Type in Rust

The Slice Type in Rust

In this lesson, we will look into the slice type in Rust.

GitHub repo with all the code

github.com/codeTIT4N/rust-school

For this lesson: https://github.com/codeTIT4N/rust-school/tree/main/lesson-8

Make sure to star/fork/watch it on GitHub.

Prerequisites

To understand this article you must be familiar with the concept of ownership and references in Rust.

The Slice Type

This is another kind of reference. Which means it will have no ownership.

Definition: Slices let you reference a contiguous sequence of elements in a collection rather than the whole collection. Let's try to understand this:

What problem does it solve?

Consider a problem:

Write a function that takes a string of words separated by spaces and returns the first word it finds in that string. If the function doesn’t find a space in the string, the whole string must be one word, so the entire string should be returned.

Consider the code:

fn main() {
    let mut s = String::from("hello world");
    let word = first_word(&s);
    s.clear();
    println!("{}", word);
}

fn first_word(s: &String) -> usize {
    let bytes = s.as_bytes();

    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return i;
        }
    }

    s.len()
}

In the above example, we are returning the index of the end of the first word, indicated by a space.

Note: For now, we can ignore how the first_word function is working. Because it is using things like iterators, enumerate, etc which we will learn about in upcoming articles.

For now, let's focus on the signature of the first_word function:

fn first_word(s: &String) -> usize {

Here, we are taking reference to the string as an argument, which is fine because we do not want ownership of the variable in this function.

let mut s = String::from("hello world");
let word = first_word(&s);

Here, we are passing the mutable reference to the String s in the function first_word which should return the index of the first space. If we print the value of word after this it will give: 5

This is all right but the story does not end here. After this we are doing:

s.clear();
println!("{}", word);

s.clear(); will empty the string and make it "". After this when we are printing the value of word, we get:

But the problem is that the underlying string is not present at this moment. So, there's no more string that we could meaningfully use the value 5 with. word is now totally invalid!

Can you see the problem with this?

This program compiles without any errors and would also do so if we used word after calling s.clear(). Because word isn’t connected to the state of s at all, word still contains the value 5. We could use that value 5 with the variable s to try to extract the first word out, but this would be a bug because the contents of s have changed since we saved 5 in word.

Having to worry about the index in word getting out of sync with the data in s is tedious and error prone!

Managing these indices is even more brittle if we write a second_word function. Its signature would have to look like this:

fn second_word(s: &String) -> (usize, usize) {

Now we’re tracking a starting and an ending index, and we have even more values that were calculated from data in a particular state but aren’t tied to that state at all. We have three unrelated variables floating around that need to be kept in sync.

Rust has a solution to this problem: string slices

String Slices

A string slice is a reference to part of a String, and it looks like this:

let s1 = String::from("hello world");

let hello = &s1[0..5];
let world = &s1[6..11];

println!("{hello} {world}");

Rather than a reference to the entire String, hello is a reference to a portion of the String, specified in the extra [0..5] bit. We create slices using a range within brackets by specifying [starting_index..ending_index], where starting_index is the first position in the slice and ending_index is one more than the last position in the slice.

Side note: Internally, the slice data structure stores the starting position and the length of the slice, which corresponds to ending_index minus starting_index. So, in the case of let world = &s[6..11];, world would be a slice that contains a pointer to the byte at index 6 of s with a length value of 5.

Under the hood

Let's see what is happening under the hood:

Here, you can see we have created a string slice which is a completely new reference called world

Ranges in Rust

In our above example, we were using Rust’s .. range syntax.

Let's look into some basic usage of Ranges. If you want to start at index 0, you can drop the value before the two periods. So, the following are the same:

let s2 = String::from("hello");

let slice = &s2[0..2];
let slice = &s2[..2];

By the same token, if your slice includes the last byte of the String, you can drop the trailing number. That means these are equal:

let len = s2.len();

let slice = &s2[3..len];
let slice = &s2[3..];

You can also drop both values to take a slice of the entire string. So these are equal:

let slice = &s2[0..len];
let slice = &s2[..];

More on String Slices

Now, let's fix the problem we discussed earlier with string slices. Let’s rewrite first_word to return a slice. The type that signifies “string slice” is written as &str:

fn first_word(s: &String) -> &str {
    let bytes = s.as_bytes();

    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return &s[0..i];
        }
    }

    &s[..]
}

Here notice the signature of the function:

fn first_word(s: &String) -> &str {

In this code, we are returning the string slice of the first word.

Now when we call first_word, we get back a single value that is tied to the underlying data. The value is made up of a reference to the starting point of the slice and the number of elements in the slice.

We now have a straightforward API that’s much harder to mess up because the compiler will ensure the references into the String remain valid.

Now, if I compile the following code, It will give error:

fn main() {
    let mut s = String::from("hello world");
    let word = first_word(&s);
    s.clear(); //will produce error
    println!("{}", word);
}

We get:

Recall from the borrowing rules that if we have an immutable reference to something, we cannot also take a mutable reference. Because clear needs to truncate the String, it needs to get a mutable reference. The println! after the call to clear uses the reference in word, so the immutable reference must still be active at that point. Rust disallows the mutable reference in clear and the immutable reference in word from existing at the same time, and compilation fails.

Not only has Rust made our API easier to use, but it has also eliminated an entire class of errors at compile time!

String Literals as Slices

We have talked about String literals in earlier articles. A string literal looks like:

let s = "Hello, world!";

The type of s here is &str: It is what the Rust compiler infers for s

it’s a slice pointing to that specific point of the binary/executable. This is also why string literals are immutable; &str is an immutable reference.

String Slices as Parameters

We can also use string slices as parameters. So, our first_word function signature will now look like this:

fn first_word(s: &str) -> &str {

A more experienced Rustacean would write the signature like this instead because it allows us to use the same function on both &String values and &str values. If we have a string slice, we can pass that directly. If we have a String, we can pass a slice of the String or a reference to the String.

Defining a function to take a string slice instead of a reference to a String makes our API more general and useful without losing any functionality.

Few examples from the Rust book:

fn main() {
    let my_string = String::from("hello world");

    // `first_word` works on slices of `String`s, whether partial or whole
    let word = first_word(&my_string[0..6]);
    let word = first_word(&my_string[..]);
    // `first_word` also works on references to `String`s, which are equivalent
    // to whole slices of `String`s
    let word = first_word(&my_string);

    let my_string_literal = "hello world";

    // `first_word` works on slices of string literals, whether partial or whole
    let word = first_word(&my_string_literal[0..6]);
    let word = first_word(&my_string_literal[..]);

    // Because string literals *are* string slices already,
    // this works too, without the slice syntax!
    let word = first_word(my_string_literal);
}

Other Slices

Slices can also be used with other collections like arrays. Consider the code:

let a = [1, 2, 3, 4, 5];

let slice = &a[1..3];

This is pretty much the same as we used in string slices.

This slice has the type &[i32]. It works the same way as string slices do, by storing a reference to the first element and a length. We'll use this kind of slice for all sorts of other collections, which we will discuss in the upcoming articles of the series.

Summary

The concepts of ownership, borrowing, and slices ensure memory safety in Rust programs at compile time.

The Rust language gives you control over your memory usage in the same way as other systems programming languages, but having the owner of data automatically clean up that data when the owner goes out of scope means you don’t have to write and debug extra code to get this control.