Understanding Rust Privacy and Visibility Model

I spent the last couple of months writing code in Rust. It was probably my third or fourth attempt to write something substantial in this language, and with every attempt my understanding deepened. I'm by no means a Rust expert, so my terminology will probably be inaccurate and I'll likely get some technical details wrong too. But I had an epiphany about how the visibility and privacy model works in Rust, and I can't help but share it.

Where is the root module in Rust

In Rust, code is organized into modules. A module defines an isolated namespace containing named items like structs, traits, functions, consts, etc. But let's start from the beginning - defining the root module.

Let's create a new project (crate):

$ cargo new shiny

And update src/main.rs with the following content:

fn main() {
    println!("module '{}', file '{}'", module_path!(), file!());
}

Running it will give us something like this:

$ cargo run
module 'shiny', file 'src/main.rs'

Thus, a crate itself is our root module.

How to create submodules

In the wild, there is usually a separate .rs file per module. However, unlike in Python or Node.js, files by themselves don't define module borders in Rust. Instead, modules are defined with the special syntax form mod <name> { ... }. There can be an arbitrary number of modules per source file. It's indeed possible to split modules into different files and assemble them back at compile time but I'll touch upon that neat trick closer to the end of the article.

For now, let's keep things simple:

mod submod {
    fn hi() {
        println!("hi:\tmodule '{}', file '{}'", module_path!(), file!());
    }
}

fn main() {
    println!("main:\tmodule '{}', file '{}'", module_path!(), file!());
    submod::hi();
}

Surprisingly or not, the snippet from above won't compile:

$ cargo run
error[E0603]: function `hi` is private
 --> src/main.rs:9:13
  |
9 |     submod::hi();
  |             ^^ private function

On top of the visibility constraints, Rust adds some privacy restrictions. In Rust, almost everything is private by default! As an encapsulation fan, I absolutely love it! ❤️

Ok, let's fix the above snippet by making fn hi() public - pub fn hi().

$ cargo run
main:   module 'shiny', file 'src/main.rs'
hi:     module 'shiny::submod', file 'src/main.rs'

So, back to modules... What did we just learn here? A module can see its direct submodules, but not the items in them unless those items are public.

The root module shiny could see its direct descendant submod. Even though it wasn't defined as public. At the same time, shiny could not see the item from submod until it was made public explicitly. That's how items are exposed from modules.
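To make the rule concrete, here is a minimal sketch. The names greeter, greet, and salutation are made up for illustration and don't come from the examples above:

```rust
// A private module declared in the crate root. The root can still name it,
// because a module is an item of the module that contains it.
mod greeter {
    // Public: callable from the parent module.
    pub fn greet(name: &str) -> String {
        format!("{}, {}!", salutation(), name)
    }

    // Private: only code inside `greeter` (and its submodules) can call it.
    fn salutation() -> &'static str {
        "Hello"
    }
}

fn main() {
    // OK: `greeter` is visible here, and `greet` is `pub`.
    println!("{}", greeter::greet("Rust")); // prints "Hello, Rust!"

    // Would not compile (E0603): `salutation` is private to `greeter`.
    // println!("{}", greeter::salutation());
}
```

Note that greet() can freely call the private salutation() because both live in the same module.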

Deeply-nested modules

Let's extend our example:

mod submod {
    pub fn hi() {
        println!("hi:\tmodule '{}', file '{}'", module_path!(), file!());
    }

    mod subsubmod {
        pub fn hey() {
            println!("hey:\tmodule '{}', file '{}'", module_path!(), file!());
        }
    }
}

fn main() {
    println!("main:\tmodule '{}', file '{}'", module_path!(), file!());
    submod::hi();
    submod::subsubmod::hey();
}

Oops, it happened again. The snippet from above won't compile:

$ cargo run
error[E0603]: module `subsubmod` is private
  --> src/main.rs:16:13
   |
16 |     submod::subsubmod::hey();
   |             ^^^^^^^^^ private module

Much like functions, modules are subject to privacy rules too. So, we have to make subsubmod public if we want to use it outside of submod - pub mod subsubmod { ... }.

Running it produces the following output:

$ cargo run
main:   module 'shiny', file 'src/main.rs'
hi:     module 'shiny::submod', file 'src/main.rs'
hey:    module 'shiny::submod::subsubmod', file 'src/main.rs'

As we already know, a module defines an isolated namespace. But all modules combined create a global hierarchy of namespaces! Defining a new module can be thought of as inserting a new namespace into the hierarchy at the location of the definition. Defining a new module with submodules can be thought of as inserting a whole new subtree!
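A small sketch of that hierarchy, assuming three functions that (hypothetically) all share the name path. They don't clash because each one lives in its own namespace, and module_path!() reports where in the tree each of them sits:

```rust
// Lives in the crate root namespace.
fn path() -> &'static str {
    module_path!()
}

mod submod {
    // Same name as the root function, but a different namespace.
    pub fn path() -> &'static str {
        module_path!()
    }

    pub mod subsubmod {
        // One more level down the tree.
        pub fn path() -> &'static str {
            module_path!()
        }
    }
}

fn main() {
    // Each call prints a longer path: the root, then one extra segment per level.
    println!("{}", path());
    println!("{}", submod::path());
    println!("{}", submod::subsubmod::path());
}
```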

Accessing parent modules from submodules

Let's see what happens if a child module tries to access its parent's items:

mod submod {
    struct Foo {
        bar: i32,
    }

    mod subsubmod {
        fn hey() {
            let _f = Foo { bar: 42 };
        }
    }
}

$ cargo run
error[E0422]: cannot find struct, variant or union type `Foo` in this scope
 --> src/main.rs:8:22
  |
8 |             let _f = Foo { bar: 42 };
  |                      ^^^ not found in this scope

And it won't work again. As we already know, modules define isolated namespaces. Thus, the name Foo is defined in a namespace created by module submod, but not in the nested namespace subsubmod. We can fix the visibility issue here using the following trick: let _f = super::Foo {...}. The parent's namespace is accessible via the super keyword and Foo is defined there (alternatively, we could use the full name of Foo starting from the crate root crate::submod::Foo). But once the visibility issue is resolved, no privacy issue arises!

Ok, what did we just learn?

  • By default, name resolution is relative to the current module/namespace.
  • We can specify item names using absolute or relative paths.
  • We can refer to a parent module using the keyword super.
  • We can refer to the crate's root module using the keyword crate.
  • Submodules (of any depth) can access the private items of their ancestors!
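The points above can be sketched in one file. ROOT_LIMIT and the two helper functions are hypothetical additions of mine, the rest mirrors the Foo example:

```rust
// Private constant in the crate root.
const ROOT_LIMIT: i32 = 10;

mod submod {
    // Private to `submod`, yet reachable from its descendants.
    struct Foo {
        bar: i32,
    }

    pub mod subsubmod {
        pub fn relative() -> i32 {
            // Relative path: `super` is the parent module `submod`.
            let f = super::Foo { bar: 42 };
            f.bar
        }

        pub fn absolute() -> i32 {
            // Absolute path: starts from the crate root.
            let f = crate::submod::Foo { bar: 42 };
            // `super::super` climbs two levels up, to the crate root,
            // whose private constant is accessible from here as well.
            f.bar + super::super::ROOT_LIMIT
        }
    }
}

fn main() {
    println!("{}", submod::subsubmod::relative()); // prints 42
    println!("{}", submod::subsubmod::absolute()); // prints 52
}
```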

Improving parent's privacy

But what if we want to give a little bit more privacy to our parents? We can define sibling modules and put the private stuff there:

mod submod {
    pub mod subsubmod1 {
        struct Foo {
            bar: i32,
        }
    }

    pub mod subsubmod2 {
        fn hey() {
            let _f = super::subsubmod1::Foo { bar: 42 };
        }
    }
}

Luckily, it won't compile:

$ cargo run
error[E0603]: struct `Foo` is private
  --> src/main.rs:10:41
   |
10 |             let _f = super::subsubmod1::Foo { bar: 42 };
   |                                         ^^^ private struct

IMO, it's pretty handy that sibling modules are isolated from each other.
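If the siblings do need to cooperate, one option is to make the type public while keeping its fields private. This is only a sketch of that idea; the new() constructor and the bar() getter are my own additions, not part of the snippet above:

```rust
mod submod {
    pub mod subsubmod1 {
        // The type is public, but its field stays private to `subsubmod1`.
        pub struct Foo {
            bar: i32,
        }

        impl Foo {
            // The only way for outsiders to build a `Foo`.
            pub fn new(bar: i32) -> Foo {
                Foo { bar }
            }

            // Read-only access to the private field.
            pub fn bar(&self) -> i32 {
                self.bar
            }
        }
    }

    pub mod subsubmod2 {
        pub fn hey() -> i32 {
            // `super::subsubmod1::Foo { bar: 42 }` would fail here (E0451):
            // the field `bar` is private to the sibling module.
            let f = super::subsubmod1::Foo::new(42);
            f.bar()
        }
    }
}

fn main() {
    println!("{}", submod::subsubmod2::hey()); // prints 42
}
```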

Introducing use keyword

So far we have learned how to define modules of any depth. Assuming the privacy constraints are satisfied, we can refer to any module using either a relative or an absolute path. However, the code would get cumbersome pretty quickly if we had to spell out such paths everywhere.

It'd be great if we could import names from one namespace into another:

mod foo {
    pub mod bar {
        pub mod baz {
            // `pub` is required so that `main` can call these functions.
            pub fn f1() {}
            pub fn f2() {}
            pub fn f3() {}
        }
    }
}

use foo::bar::baz;

fn main() {
    baz::f1();  // instead of foo::bar::baz::f1()
}

You can use use (no pun intended) to import a whole module or just a few items from it. For instance, the following statement would bring two items from the module baz into the current namespace:

use foo::bar::baz::{f1, f2};

Unlike mod or pub, having use in the language doesn't seem mandatory. It's more like syntactic sugar that frees developers from writing cumbersome module paths everywhere.
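Here is a sketch that puts the different import forms side by side. The return values and the alias third are made up for illustration:

```rust
mod foo {
    pub mod bar {
        pub mod baz {
            pub fn f1() -> &'static str { "f1" }
            pub fn f2() -> &'static str { "f2" }
            pub fn f3() -> &'static str { "f3" }
        }
    }
}

// Import the module itself, then qualify items with its short name.
use foo::bar::baz;
// Import selected items; `as` gives one of them a local alias.
use foo::bar::baz::{f2, f3 as third};

fn main() {
    println!("{}", baz::f1());           // prints "f1"
    println!("{}", f2());                // prints "f2"
    println!("{}", third());             // prints "f3"
    // The full path keeps working; `use` only adds shorter names.
    println!("{}", foo::bar::baz::f1()); // prints "f1"
}
```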

Re-exporting items with pub use

What if we want to hide the internal structure of a module from its consumers but at the same time include some of the items from these submodules into the parent's public interface?

mod api {
    mod v1 {
        pub const FOO: i32 = 42;
    }
}

fn main() {
    let foo = api::v1::FOO;
}

As usual, the snippet from above won't compile:

$ cargo run

error[E0603]: module `v1` is private
 --> src/main.rs:8:20
  |
8 |     let foo = api::v1::FOO;
  |                    ^^ private module

Despite FOO being a public constant, we cannot access it outside of api because the v1 submodule is supposed to be a hidden implementation detail. This reveals another rule Rust relies on to define the privacy restrictions: even if an item itself is public, every module on the path used to reach it must be accessible too.

Luckily, Rust allows us to re-export public items by combining pub and use in the same clause:

mod api {
    pub use v1::FOO;

    mod v1 {
        pub const FOO: i32 = 42;
    }
}

Since modules are full-fledged items, they can be re-exported too:

mod api {
    pub use v1 as v2;

    pub mod v1 {
        pub const FOO: i32 = 42;
    }
}

fn main() {
    let foo = api::v2::FOO;
}

But only public stuff can be re-exported. For instance, the following snippet wouldn't compile:

mod api {
    pub use v1;

    mod v1 {
        pub const FOO: i32 = 42;
    }
}

By re-exporting public items, we create alternative paths to them. In particular, we can use this technique to hide private modules on the way to public items. By combining use and pub, Rust introduces cross-branch shortcuts in its tree-like module hierarchy, making it look like a graph. The resulting language gets much more expressive, but at the same time, it also gets harder to reason about.
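A minimal sketch of such alternative paths, assuming a hypothetical Config struct and a retries() helper next to FOO. Both paths name the very same items, so a value created via one path can be passed where the other path is expected:

```rust
mod api {
    // Shortcut: expose `FOO` and `Config` directly from `api`.
    pub use self::v1::{Config, FOO};

    pub mod v1 {
        pub const FOO: i32 = 42;

        pub struct Config {
            pub retries: u32,
        }
    }
}

// Spelled with the long path...
fn retries(c: api::v1::Config) -> u32 {
    c.retries
}

fn main() {
    // ...and called with a value built via the short one: it's the same type.
    let c = api::Config { retries: 3 };
    println!("{}", retries(c));                // prints 3
    println!("{}", api::FOO == api::v1::FOO);  // prints true
}
```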

Rust's superpower pub(in path)

Things we covered up until this point weren't unique to Rust. To a large extent, Rust's module system is a combination of fortunate features borrowed from the mainstream languages. However, here I'm going to talk about a feature that I saw only in Rust (it's likely a borrowing too but I'm not that well-rounded in programming to recognize the origin).

Disclaimer: I'm an apologist of the idea that developers should strive to express as many intents as possible in their type definitions. Supposedly, it reduces the possibility of design misinterpretation or code misuse. Probably, that's why I find the pub(in path) feature so fascinating.

Consider the following situation - you are working on a module that exposes some public API. Internally, it consists of a bunch of private submodules. However, in the end, only a subset of functions is exposed publicly via a pub use statement at the top of the module.

mod reader {
    // reader's public interface.
    pub use reader_impl::{read_entry, Entry};

    // Internal module.
    mod decoder {
        // This is public because we need it in `reader_impl` mod.
        pub struct Record {}

        // Same reason for being `pub`.
        pub fn decode(buf: &Vec<u8>) -> Record {
            Record {}
        }
    }

    // Another internal module.
    mod reader_impl {
        use super::decoder::{decode, Record};

        // Re-exported as part of the public interface.
        pub struct Entry {
            r: Record,
        }

        // Re-exported as part of the public interface.
        pub fn read_entry() -> Entry {
            Entry {
                r: read_record(),
            }
        }

        fn read_record() -> Record {
            decode(&vec![1, 2, 3])
        }
    }
}

fn main() {
    let e = reader::read_entry();
}

Supposedly, only fn read_entry() and struct Entry are public items with a relatively stable contract, while struct Record is an internal beast that changes often and is hence hidden from the external consumers.

One day, a friend of yours decides to work on the reader module. However, she is not fully aware of the problem with the volatile Record struct. And she also finds it convenient to have a lower-level read_record() function publicly exposed. So she just adds it to the reader's public API:

mod reader {
    // reader's public interface (extended).
    pub use reader_impl::{read_entry, read_record, Entry};
    
    // ...

    mod reader_impl {        
        // ...

        pub fn read_record() -> Record {
            Record {}
        }
    }
}

fn main() {
    let e = reader::read_entry();
    let r = reader::read_record();
}

Wow, it worked out! There are no compilation errors! Despite the Record struct being a part of a private module...

Even though Record was never exposed from the reader module, it's still possible to return an instance of Record from a public function read_record(). That might be both misleading and dangerous because as the original author of the code we didn't communicate the intent clearly.

On top of that, it seems impossible to specify the type of the read_record() return value on the caller side:

fn main() {
    // error[E0603]: module `decoder` is private
    let r: reader::decoder::Record = reader::read_record();
}
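Interestingly, the value itself remains perfectly usable as long as type inference does the naming for us. A minimal sketch, assuming a public id field on Record that isn't part of the original example:

```rust
mod reader {
    // Private module: its path cannot be spelled outside of `reader`.
    mod decoder {
        // Public type with a public field (the `id` field is hypothetical).
        pub struct Record {
            pub id: u32,
        }
    }

    // Public function leaking a type that callers cannot name.
    pub fn read_record() -> decoder::Record {
        decoder::Record { id: 7 }
    }
}

fn main() {
    // No type annotation: the compiler infers `Record` for us.
    let r = reader::read_record();
    // Public fields of the unnameable type are accessible too.
    println!("{}", r.id); // prints 7
}
```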

Up until this point, combining public and private modules with use imports has allowed us to cover almost every possible scenario. However, in this particular situation, we need something more powerful...

pub(in <path>) to the rescue!

By default, pub makes an item public globally. Thus, public items residing in private modules are as public as their fully-public counterparts (apparently). It's just the path to these items that's not fully public. However, it seems like one can restrict the scope of the pub modifier!

So, to make our intent of keeping the Record struct private clear, we could rewrite our original snippet as follows:

mod reader {
    // reader's public interface.
    pub use reader_impl::{read_entry, Entry};

    // Internal module.
    mod decoder {
        // CHANGE IS HERE  <--------------
        pub(super) struct Record {}

        // CHANGE IS HERE  <--------------
        pub(super) fn decode(buf: &Vec<u8>) -> Record {
            Record {}
        }
    }

    mod reader_impl {
        use super::decoder::{decode, Record};

        pub struct Entry {
            r: Record,
        }

        pub fn read_entry() -> Entry {
            Entry {
                r: read_record(),
            }
        }

        fn read_record() -> Record {
            decode(&vec![1, 2, 3])
        }
    }
}

fn main() {
    let e = reader::read_entry();
}

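With the above code, making the reader_impl::read_record function pub would result in the following error (as far as I know, compilers from Rust 1.74 onwards moved this check to the warn-by-default private_interfaces lint, so the exact diagnostics depend on your toolchain):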

error[E0446]: restricted type `Record` in public interface
  --> src/main.rs:27:9
   |
8  |         pub(super) struct Record {}
   |         ------------------------ `Record` declared as restricted
...
27 |         pub fn read_record() -> Record {
   |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ can't leak restricted type

IMO, it's yet another superpower of Rust!

In general, it's simply impossible to return a private type from a public function. For instance, the following snippet won't compile either:

mod reader {
    struct Entry {}

    pub fn read_entry() -> Entry {
        Entry {}
    }
}

fn main() {
    let e = reader::read_entry();
}

error[E0446]: private type `reader::Entry` in public interface
 --> src/main.rs:4:5
  |
2 |     struct Entry {}
  |     ------------ `reader::Entry` declared as private
3 |
4 |     pub fn read_entry() -> Entry {
  |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ can't leak private type

error: type `reader::Entry` is private
  --> src/main.rs:10:13
   |
10 |     let e = reader::read_entry();
   |             ^^^^^^^^^^^^^^^^^^^^ private type

You can read more about it in this Language Internals Forum discussion.

Last but not least, much like with use, pub(in path) can be used with either a relative or an absolute path. In our example we used pub(super); however, we could also use the absolute form pub(in crate::reader). Don't miss the mandatory in keyword here :)
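For comparison, here is a sketch of the restricted forms next to each other. The function names and return values are invented for illustration, and decoder is made pub here only so that main can reach the pub(crate) item:

```rust
mod reader {
    pub mod decoder {
        // Absolute form: visible only inside `crate::reader`.
        pub(in crate::reader) fn decode() -> u8 { 1 }

        // Relative form: visible in the parent module, i.e. `reader` again.
        pub(super) fn checksum() -> u8 { 2 }

        // Visible anywhere in the current crate.
        pub(crate) fn version() -> u8 { 3 }

        // Visible in the current module only, same as no modifier at all.
        pub(self) fn scratch() -> u8 { 4 }

        // Inside `decoder` every item above is accessible.
        pub fn total() -> u8 {
            decode() + checksum() + version() + scratch()
        }
    }

    pub fn read() -> u8 {
        // `reader` is within the scope of both restricted items.
        decoder::decode() + decoder::checksum()
    }
}

fn main() {
    println!("{}", reader::read());             // prints 3
    println!("{}", reader::decoder::version()); // prints 3
    println!("{}", reader::decoder::total());   // prints 10

    // Would not compile (E0603): `decode` is restricted to `crate::reader`.
    // reader::decoder::decode();
}
```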

Separating modules into different files

It's actually very simple to split a source file with multiple modules into multiple files containing submodules. What we saw up until now was the so-called module definition form. However, there is also a module declaration form.

The following single-file program:

// main.rs
mod foo {
    pub const FOO: i32 = 42;
}

fn main() {
    println!("The answer is {}", foo::FOO);
}

...can be replaced with the next project structure:

// main.rs
mod foo;  //  <--- module declaration

fn main() {
    println!("The answer is {}", foo::FOO);
}

// foo.rs
pub const FOO: i32 = 42;

Much like with use, it's just a form of sugar. Whenever the Rust compiler sees a module declaration like mod foo; it looks up the file foo.rs in the folder of the source file it found the declaration in (that's the rule for main.rs, lib.rs, and mod.rs files; for any other file, say bar.rs, the lookup goes into the bar/ subfolder instead). If foo.rs exists, the mod foo; statement is replaced with the content of the file wrapped into the module definition clause:

// main.rs
mod foo { <foo.rs content goes here> }

There is also a slightly more advanced form when a subfolder named after the module with the mod.rs file in it is used as a module but it's pretty much the same substitution idea:

// main.rs
mod foo { <foo/mod.rs content goes here> }

From my understanding, the process starts from the crate root module (residing in either main.rs or lib.rs depending on the crate type) and then recursively goes over every module declaration. That's why you can have utterly broken .rs files in the workspace and the project would still compile just fine if there is no corresponding module declaration for them. The compiler will simply ignore them. That caused quite some confusion for me until I understood the approach.

As a result, a project structure like this arises:

$ tree ./src/
./src/
├── engine
│   ├── engine.rs
│   └── mod.rs
├── error.rs
├── input
│   ├── decoder
│   │   ├── decoder.rs
│   │   ├── mod.rs
│   │   ├── record.rs
│   │   └── regex.rs
│   ├── input.rs
│   ├── mod.rs
│   └── reader.rs
└── main.rs

And what's really great about module declarations is that they don't introduce any new behavioral concepts. Everything that is true for in-place module definitions is still true for module declarations, and nothing more.

Instead of conclusion

This certainly wasn't a complete overview of the visibility and privacy concepts, and lots of aspects were left uncovered. However, for me, these were the bits that helped me wrap my head around the topic so I could start to make reasonable decisions on how to structure my code or approach the compiler errors.

Have fun and stay tuned to find out when pq alpha is released!

Written by Ivan Velichko

Follow me on twitter @iximiuz