Rust Generated Code Guide
This page describes exactly what Rust code the protocol buffer compiler generates for any given protocol definition.
Any differences between proto2 and proto3 generated code are highlighted. You should read the proto2 language guide and/or proto3 language guide before reading this document.
Protobuf Rust
Protobuf Rust is an implementation of protocol buffers designed to be able to sit on top of other existing protocol buffer implementations that we refer to as ‘kernels’.
The decision to support multiple non-Rust kernels has significantly influenced our public API, including the choice to use custom types like ProtoStr over Rust std types like str. See Rust Proto Design Decisions for more on this topic.
Generated Filenames
Each rust_proto_library will be compiled as one crate. Most importantly, for every .proto file in the srcs of the corresponding proto_library, one Rust file is emitted, and all these files form a single crate.
Files generated by the compiler vary between kernels. In general, the names of the output files are computed by taking the name of the .proto file and replacing the extension.
Generated files:
- C++ kernel:
  - .c.pb.rs - generated Rust code.
  - .pb.thunks.cc - generated C++ thunks (glue code that Rust code calls, and that delegates to the C++ Protobuf APIs).
- C++ Lite kernel:
  - <same as C++ kernel>
- UPB kernel:
  - .u.pb.rs - generated Rust code. (However, rust_proto_library relies on the .thunks.c file produced by upb_proto_aspect.)
If the proto_library contains more than one file, the first file is declared a “primary” file and is treated as the entry point for the crate; that file will contain both the gencode corresponding to the .proto file, and also re-exports for all symbols defined in the files corresponding to all “secondary” files.
Packages
Unlike in most other languages, the package declarations in the .proto files are not used in Rust codegen. Instead, each rust_proto_library(name = "some_rust_proto") target emits a crate named some_rust_proto which contains the generated code for all .proto files in the target.
Messages
Given the message declaration:
message Foo {}
The compiler generates a struct named Foo. The Foo struct defines the following methods:
- fn new() -> Self: Creates a new instance of Foo.
- fn parse(data: &[u8]) -> Result<Self, protobuf::ParseError>: Parses data into an instance of Foo if data holds a valid wire format representation of Foo. Otherwise, the function returns an error.
- fn clear_and_parse(&mut self, data: &[u8]) -> Result<(), ParseError>: Like calling .clear() and parse() in sequence.
- fn serialize(&self) -> Result<Vec<u8>, SerializeError>: Serializes the message to Protobuf wire format. Serialization can fail but rarely will. Failure reasons include exceeding the maximum message size, insufficient memory, and required fields (proto2) that are unset.
- fn merge_from(&mut self, other): Merges self with other.
- fn as_view(&self) -> FooView<'_>: Returns an immutable handle (view) to Foo. This is further covered in the section on proxy types.
- fn as_mut(&mut self) -> FooMut<'_>: Returns a mutable handle (mut) to Foo. This is further covered in the section on proxy types.
Foo implements the following traits:
- std::fmt::Debug
- std::default::Default
- std::clone::Clone
- std::ops::Drop
- std::marker::Send
- std::marker::Sync
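To make the shape of these methods concrete, the following is a minimal hand-written sketch, not actual generated code. It assumes a hypothetical message with a single field `uint32 id = 1;`, defines its own ParseError and SerializeError stand-ins rather than using the protobuf crate, and rejects unknown fields, which a real parser would handle differently.

```rust
// Stand-ins for protobuf::ParseError and protobuf::SerializeError.
#[derive(Debug, PartialEq)]
pub struct ParseError;
#[derive(Debug, PartialEq)]
pub struct SerializeError;

// Hand-written model of a message with one hypothetical field: `uint32 id = 1;`.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct Foo {
    id: Option<u32>,
}

impl Foo {
    pub fn new() -> Self {
        Self::default()
    }

    pub fn id(&self) -> u32 {
        self.id.unwrap_or(0)
    }

    pub fn set_id(&mut self, val: u32) {
        self.id = Some(val);
    }

    pub fn clear(&mut self) {
        self.id = None;
    }

    pub fn parse(data: &[u8]) -> Result<Self, ParseError> {
        let mut msg = Self::new();
        msg.clear_and_parse(data)?;
        Ok(msg)
    }

    // Clears the message first, then reads fields from `data`.
    pub fn clear_and_parse(&mut self, data: &[u8]) -> Result<(), ParseError> {
        self.clear();
        let mut pos = 0;
        while pos < data.len() {
            // Tag 0x08 means field number 1 with wire type 0 (varint).
            if data[pos] != 0x08 {
                return Err(ParseError);
            }
            pos += 1;
            let mut value: u64 = 0;
            let mut shift: u32 = 0;
            loop {
                let byte = *data.get(pos).ok_or(ParseError)?;
                pos += 1;
                value |= u64::from(byte & 0x7f) << shift;
                if byte & 0x80 == 0 {
                    break;
                }
                shift += 7;
                if shift >= 64 {
                    return Err(ParseError);
                }
            }
            self.id = Some(value as u32);
        }
        Ok(())
    }

    // This sketch never fails to serialize; the Result mirrors the documented signature.
    pub fn serialize(&self) -> Result<Vec<u8>, SerializeError> {
        let mut out = Vec::new();
        if let Some(mut v) = self.id {
            out.push(0x08);
            while v >= 0x80 {
                out.push((v as u8 & 0x7f) | 0x80);
                v >>= 7;
            }
            out.push(v as u8);
        }
        Ok(out)
    }
}
```

With this model, a message whose id is 300 round-trips through serialize() and parse(), while truncated input such as a lone tag byte yields a ParseError.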
Message Proxy Types
As a consequence of the requirement to support multiple kernels with a single Rust API, we cannot in some situations use native Rust references (&T and &mut T), but instead, we need to express these concepts using types - Views and Muts. These situations are shared and mutable references to:
- Messages
- Repeated fields
- Map fields
For example, the compiler emits structs FooView<'a> and FooMut<'msg> alongside Foo. These types are used in place of &Foo and &mut Foo, and they behave the same as native Rust references in terms of borrow checker behavior. Just like native borrows, Views are Copy and the borrow checker will enforce that you can either have any number of Views or at most one Mut live at a given time.
For the purposes of this documentation, we focus on describing all methods emitted for the owned message type (Foo). A subset of these functions with &self receiver will also be included on the FooView<'msg>. A subset of these functions with either &self or &mut self will also be included on the FooMut<'msg>.
To create an owned message type from a View / Mut type call to_owned(), which creates a deep copy.
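The relationship between the owned type and its proxies can be sketched with plain Rust references. This is a hand-written illustration under the assumption that a view wraps a shared borrow and a mut wraps an exclusive borrow; real generated proxies wrap kernel-specific handles, and the value field and its accessors are hypothetical.

```rust
// Hand-written stand-in for an owned message with one hypothetical field.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct Foo {
    value: i32,
}

// Stand-in for the generated view: Copy, like a shared reference.
#[derive(Debug, Clone, Copy)]
pub struct FooView<'msg> {
    inner: &'msg Foo,
}

// Stand-in for the generated mut: not Copy, like an exclusive reference.
pub struct FooMut<'msg> {
    inner: &'msg mut Foo,
}

impl Foo {
    pub fn new() -> Self {
        Self::default()
    }
    pub fn as_view(&self) -> FooView<'_> {
        FooView { inner: self }
    }
    pub fn as_mut(&mut self) -> FooMut<'_> {
        FooMut { inner: self }
    }
    pub fn value(&self) -> i32 {
        self.value
    }
}

impl<'msg> FooView<'msg> {
    pub fn value(self) -> i32 {
        self.inner.value
    }
    // Deep copy back into an owned message.
    pub fn to_owned(self) -> Foo {
        self.inner.clone()
    }
}

impl<'msg> FooMut<'msg> {
    pub fn set_value(&mut self, val: i32) {
        self.inner.value = val;
    }
}

pub fn view_and_mut_demo() -> (i32, i32, Foo) {
    let mut msg = Foo::new();
    {
        // Only one Mut may be live at a time; it ends with this scope.
        let mut m = msg.as_mut();
        m.set_value(7);
    }
    let v1 = msg.as_view();
    let v2 = v1; // Views are Copy, so v1 stays usable.
    (v1.value(), v2.value(), v2.to_owned())
}
```

Holding the FooMut past the point where a view is created would be rejected by the borrow checker, which is the same behavior a native &mut Foo would have.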
Nested Types
Given the message declaration:
message Foo {
  message Bar {
    enum Baz { ... }
  }
}
In addition to the struct named Foo, a module named foo is created to contain the struct for Bar. And similarly a nested module named bar to contain the deeply nested enum Baz:
pub struct Foo {}
pub mod foo {
    pub struct Bar {}
    pub mod bar {
        pub struct Baz { ... }
    }
}
Fields
In addition to the methods described in the previous section, the protocol buffer compiler generates a set of accessor methods for each field defined within the message in the .proto file.
Following Rust style, the methods are in lower-case/snake-case, such as has_foo() and clear_foo(). Note that the capitalization of the field name portion of the accessor maintains the style from the original .proto file, which in turn should be lower-case/snake-case per the .proto file style guide.
Optional Numeric Fields (proto2 and proto3)
For either of these field definitions:
optional int32 foo = 1;
required int32 foo = 1;
The compiler will generate the following accessor methods:
- fn has_foo(&self) -> bool: Returns true if the field is set.
- fn foo(&self) -> i32: Returns the current value of the field. If the field is not set, it returns the default value.
- fn foo_opt(&self) -> protobuf::Optional<i32>: Returns an optional with the variant Set(value) if the field is set or Unset(default value) if it’s unset.
- fn set_foo(&mut self, val: i32): Sets the value of the field. After calling this, has_foo() will return true and foo() will return val.
- fn clear_foo(&mut self): Clears the value of the field. After calling this, has_foo() will return false and foo() will return the default value.
For other numeric field types (including bool), int32 is replaced with the corresponding Rust type according to the scalar value types table.
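The presence semantics of these accessors can be modeled in a few lines. The sketch below is hand-written, not generated code: Optional is a local stand-in modeled on the Set/Unset variants described above, and the message stores the field as a std Option with an assumed default of 0.

```rust
// Local stand-in for protobuf::Optional, with the two variants described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Optional<T> {
    Set(T),
    Unset(T),
}

// Hand-written model of `optional int32 foo = 1;` with default value 0.
#[derive(Debug, Default)]
pub struct Foo {
    foo: Option<i32>,
}

impl Foo {
    pub fn has_foo(&self) -> bool {
        self.foo.is_some()
    }
    pub fn foo(&self) -> i32 {
        self.foo.unwrap_or(0)
    }
    pub fn foo_opt(&self) -> Optional<i32> {
        match self.foo {
            Some(v) => Optional::Set(v),
            None => Optional::Unset(0),
        }
    }
    pub fn set_foo(&mut self, val: i32) {
        self.foo = Some(val);
    }
    pub fn clear_foo(&mut self) {
        self.foo = None;
    }
}
```

Note that explicitly setting the field to 0 is distinguishable from leaving it unset: foo() returns 0 in both cases, but has_foo() and foo_opt() differ.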
Implicit Presence Numeric Fields (proto3)
For these field definitions:
int32 foo = 1;
The compiler will generate the following accessor methods:
- fn foo(&self) -> i32: Returns the current value of the field. If the field is not set, returns 0.
- fn set_foo(&mut self, val: i32): Sets the value of the field. After calling this, foo() will return val.
For other numeric field types (including bool), int32 is replaced with the corresponding Rust type according to the scalar value types table.
Optional String/Bytes Fields (proto2 and proto3)
For any of these field definitions:
optional string foo = 1;
required string foo = 1;
optional bytes foo = 1;
required bytes foo = 1;
The compiler will generate the following accessor methods:
- fn has_foo(&self) -> bool: Returns true if the field is set.
- fn foo(&self) -> &protobuf::ProtoStr: Returns the current value of the field. If the field is not set, it returns the default value.
- fn foo_opt(&self) -> protobuf::Optional<&ProtoStr>: Returns an optional with the variant Set(value) if the field is set or Unset(default value) if it’s unset.
- fn clear_foo(&mut self): Clears the value of the field. After calling this, has_foo() will return false and foo() will return the default value.
For fields of type bytes the compiler will generate the ProtoBytes type instead.
Implicit Presence String/Bytes Fields (proto3)
For these field definitions:
optional string foo = 1;
string foo = 1;
optional bytes foo = 1;
bytes foo = 1;
The compiler will generate the following accessor methods:
- fn foo(&self) -> &ProtoStr: Returns the current value of the field. If the field is not set, returns the empty string/empty bytes.
- fn foo_opt(&self) -> Optional<&ProtoStr>: Returns an optional with the variant Set(value) if the field is set or Unset(default value) if it’s unset.
- fn set_foo(&mut self, value: IntoProxied<ProtoString>): Sets the field to value. After calling this function foo() will return value and has_foo() will return true.
- fn has_foo(&self) -> bool: Returns true if the field is set.
- fn clear_foo(&mut self): Clears the value of the field. After calling this, has_foo() will return false and foo() will return the default value.
For fields of type bytes the compiler will generate the ProtoBytes type instead.
Singular String and Bytes Fields with Cord Support
[ctype = CORD] enables bytes and strings to be stored as an absl::Cord in C++ Protobufs. absl::Cord currently does not have an equivalent type in Rust. Protobuf Rust uses an enum to represent a cord field:
enum ProtoStringCow<'a> {
    Owned(ProtoString),
    Borrowed(&'a ProtoStr)
}
In the common case, for small strings, an absl::Cord stores its data as a contiguous string. In this case cord accessors return ProtoStringCow::Borrowed. If the underlying absl::Cord is non-contiguous, the accessor copies the data from the cord into an owned ProtoString and returns ProtoStringCow::Owned. The ProtoStringCow implements Deref<Target=ProtoStr>.
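The borrowed-or-owned behavior can be illustrated with std types only. In this sketch StringCow and Cord are hypothetical stand-ins for ProtoStringCow and absl::Cord: the cord is modeled as a list of chunks, and the read method name is an assumption made for the example.

```rust
use std::ops::Deref;

// Stand-in for ProtoStringCow, using std String and str.
#[derive(Debug)]
pub enum StringCow<'a> {
    Owned(String),
    Borrowed(&'a str),
}

// Both variants can be read as a plain &str through Deref.
impl Deref for StringCow<'_> {
    type Target = str;
    fn deref(&self) -> &str {
        match self {
            StringCow::Owned(s) => s.as_str(),
            StringCow::Borrowed(s) => s,
        }
    }
}

// Stand-in for a cord: zero or more chunks that together form one string.
pub struct Cord {
    pub chunks: Vec<String>,
}

impl Cord {
    // Borrows when the data is contiguous, copies into an owned string otherwise.
    pub fn read(&self) -> StringCow<'_> {
        match self.chunks.as_slice() {
            [] => StringCow::Borrowed(""),
            [only] => StringCow::Borrowed(only.as_str()),
            many => StringCow::Owned(many.concat()),
        }
    }
}
```

Because of the Deref implementation, callers can use string methods such as len() on the result without caring which variant they received.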
For any of these field definitions:
optional string foo = 1 [ctype = CORD];
string foo = 1 [ctype = CORD];
optional bytes foo = 1 [ctype = CORD];
bytes foo = 1 [ctype = CORD];
The compiler generates the following accessor methods:
- fn foo(&self) -> ProtoStringCow<'_>: Returns the current value of the field. If the field is not set, returns the empty string/empty bytes.
- fn set_foo(&mut self, value: IntoProxied<ProtoString>): Sets the field to value. After calling this function foo() returns value and has_foo() returns true.
- fn has_foo(&self) -> bool: Returns true if the field is set.
- fn clear_foo(&mut self): Clears the value of the field. After calling this, has_foo() returns false and foo() returns the default value. Cords have not been implemented yet.
For fields of type bytes the compiler generates the ProtoBytesCow type instead.
Optional Enum Fields (proto2 and proto3)
Given the enum type:
enum Bar {
  BAR_UNSPECIFIED = 0;
  BAR_VALUE = 1;
  BAR_OTHER_VALUE = 2;
}
The compiler generates a struct where each variant is an associated constant:
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
#[repr(transparent)]
pub struct Bar(i32);
impl Bar {
    pub const Unspecified: Bar = Bar(0);
    pub const Value: Bar = Bar(1);
    pub const OtherValue: Bar = Bar(2);
}
For either of these field definitions:
optional Bar foo = 1;
required Bar foo = 1;
The compiler will generate the following accessor methods:
- fn has_foo(&self) -> bool: Returns true if the field is set.
- fn foo(&self) -> Bar: Returns the current value of the field. If the field is not set, it returns the default value.
- fn foo_opt(&self) -> Optional<Bar>: Returns an optional with the variant Set(value) if the field is set or Unset(default value) if it’s unset.
- fn set_foo(&mut self, val: Bar): Sets the value of the field. After calling this, has_foo() will return true and foo() will return val.
- fn clear_foo(&mut self): Clears the value of the field. After calling this, has_foo() will return false and foo() will return the default value.
Implicit Presence Enum Fields (proto3)
Given the enum type:
enum Bar {
  BAR_UNSPECIFIED = 0;
  BAR_VALUE = 1;
  BAR_OTHER_VALUE = 2;
}
The compiler generates a struct where each variant is an associated constant:
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
#[repr(transparent)]
pub struct Bar(i32);
impl Bar {
    pub const Unspecified: Bar = Bar(0);
    pub const Value: Bar = Bar(1);
    pub const OtherValue: Bar = Bar(2);
}
For these field definitions:
Bar foo = 1;
The compiler will generate the following accessor methods:
- fn foo(&self) -> Bar: Returns the current value of the field. If the field is not set, it returns the default value.
- fn set_foo(&mut self, value: Bar): Sets the value of the field. After calling this, foo() will return value.
Optional Embedded Message Fields (proto2 and proto3)
Given the message type:
message Bar {}
For any of these field definitions:
// proto2
optional Bar foo = 1;
// proto3
Bar foo = 1;
optional Bar foo = 1;
The compiler will generate the following accessor methods:
- fn foo(&self) -> BarView<'_>: Returns a view of the current value of the field. If the field is not set it returns an empty message.
- fn foo_mut(&mut self) -> BarMut<'_>: Returns a mutable handle to the current value of the field. Sets the field if it is not set. After calling this method, has_foo() returns true.
- fn foo_opt(&self) -> protobuf::Optional<BarView<'_>>: If the field is set, returns the variant Set with its value. Else returns the variant Unset with the default value.
- fn set_foo(&mut self, value: impl protobuf::IntoProxied<Bar>): Sets the field to value. After calling this method, has_foo() returns true.
- fn has_foo(&self) -> bool: Returns true if the field is set.
- fn clear_foo(&mut self): Clears the field. After calling this method has_foo() returns false.
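One consequence worth spelling out is that reading through foo() leaves presence untouched, while merely obtaining foo_mut() sets the field. The hand-written sketch below models this with a std Option; it returns Bar and &mut Bar where generated code would return BarView and BarMut, and the x field is hypothetical.

```rust
// Hand-written stand-in for the embedded message, with one hypothetical field.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct Bar {
    pub x: i32,
}

// Hand-written model of a message with `optional Bar foo = 1;`.
#[derive(Debug, Default)]
pub struct Foo {
    foo: Option<Bar>,
}

impl Foo {
    pub fn has_foo(&self) -> bool {
        self.foo.is_some()
    }
    // Stand-in for `foo(&self) -> BarView<'_>`: an unset field reads as an empty message.
    pub fn foo(&self) -> Bar {
        self.foo.clone().unwrap_or_default()
    }
    // Stand-in for `foo_mut(&mut self) -> BarMut<'_>`: sets the field if it is unset.
    pub fn foo_mut(&mut self) -> &mut Bar {
        self.foo.get_or_insert_with(Bar::default)
    }
    pub fn set_foo(&mut self, value: Bar) {
        self.foo = Some(value);
    }
    pub fn clear_foo(&mut self) {
        self.foo = None;
    }
}
```

Code that only wants to inspect a possibly unset submessage should therefore prefer foo() or foo_opt() over foo_mut().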
Repeated Fields
For any repeated field definition the compiler will generate the same three accessor methods that deviate only in the field type.
For example, given the below field definition:
repeated int32 foo = 1;
The compiler will generate the following accessor methods:
- fn foo(&self) -> RepeatedView<'_, i32>: Returns a view of the underlying repeated field.
- fn foo_mut(&mut self) -> RepeatedMut<'_, i32>: Returns a mutable handle to the underlying repeated field.
- fn set_foo(&mut self, src: impl IntoProxied<Repeated<i32>>): Sets the underlying repeated field to a new repeated field provided in src.
For different field types only the respective generic types of the RepeatedView, RepeatedMut and Repeated types will change. For example, given a field of type string the foo() accessor would return a RepeatedView<'_, ProtoString>.
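The three accessors can be pictured with a Vec-backed model. Everything below is a hand-written stand-in: the local RepeatedView and RepeatedMut types, and in particular their len, get, and push methods, are assumptions for illustration rather than a statement of the protobuf crate's API, and set_foo accepts any iterator instead of IntoProxied.

```rust
// Stand-in for RepeatedView: a Copy handle over shared element storage.
#[derive(Debug, Clone, Copy)]
pub struct RepeatedView<'msg, T> {
    items: &'msg [T],
}

// Stand-in for RepeatedMut: an exclusive handle that can grow the field.
pub struct RepeatedMut<'msg, T> {
    items: &'msg mut Vec<T>,
}

impl<'msg, T: Copy> RepeatedView<'msg, T> {
    pub fn len(&self) -> usize {
        self.items.len()
    }
    pub fn get(&self, index: usize) -> Option<T> {
        self.items.get(index).copied()
    }
}

impl<'msg, T> RepeatedMut<'msg, T> {
    pub fn push(&mut self, val: T) {
        self.items.push(val);
    }
}

// Hand-written model of a message with `repeated int32 foo = 1;`.
#[derive(Debug, Default)]
pub struct Foo {
    foo: Vec<i32>,
}

impl Foo {
    pub fn foo(&self) -> RepeatedView<'_, i32> {
        RepeatedView { items: self.foo.as_slice() }
    }
    pub fn foo_mut(&mut self) -> RepeatedMut<'_, i32> {
        RepeatedMut { items: &mut self.foo }
    }
    // Replaces the whole field with the elements of `src`.
    pub fn set_foo(&mut self, src: impl IntoIterator<Item = i32>) {
        self.foo = src.into_iter().collect();
    }
}
```

The split mirrors the message proxies: reads go through a cheap Copy view, element-level edits go through the mut handle, and set_foo swaps the entire contents at once.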
Map Fields
For this map field definition:
map<int32, int32> weight = 1;
The compiler will generate the following 3 accessor methods:
- fn weight(&self) -> protobuf::MapView<'_, i32, i32>: Returns an immutable view of the underlying map.
- fn weight_mut(&mut self) -> protobuf::MapMut<'_, i32, i32>: Returns a mutable handle to the underlying map.
- fn set_weight(&mut self, src: impl protobuf::IntoProxied<Map<i32, i32>>): Sets the underlying map to src.
For different field types only the respective generic types of the MapView, MapMut and Map types will change. For example, given a map value of type string the weight() accessor would return a MapView<'_, i32, ProtoString>.
Any
Any is not special-cased by Rust Protobuf at this time; it will behave as though it was a simple message with this definition:
message Any {
  string type_url = 1;
  bytes value = 2 [ctype = CORD];
}
Oneof
Given a oneof definition like this:
oneof example_name {
  int32 foo_int = 4;
  string foo_string = 9;
  ...
}
The compiler will generate accessors (getters, setters, hazzers) for every field as if the same field was declared as an optional field outside of the oneof. So you can work with oneof fields like regular fields, but setting one will clear the other fields in the oneof block. In addition, the following types are emitted for the oneof block:
#[non_exhaustive]
#[derive(Debug, Clone, Copy)]
pub enum ExampleName<'msg> {
    FooInt(i32) = 4,
    FooString(&'msg protobuf::ProtoStr) = 9,
    not_set(std::marker::PhantomData<&'msg ()>) = 0
}
#[derive(Debug, Copy, Clone, PartialEq, Eq)]
pub enum ExampleNameCase {
    FooInt = 4,
    FooString = 9,
    not_set = 0
}
Additionally, it will generate the two accessors:
- fn example_name(&self) -> ExampleName<'_>: Returns the enum variant indicating which field is set and the field’s value. Returns not_set if no field is set.
- fn example_name_case(&self) -> ExampleNameCase: Returns the enum variant indicating which field is set. Returns not_set if no field is set.
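The interaction between per-field setters and the two oneof accessors can be sketched as follows. This is a simplified hand-written model: the enums drop the PhantomData payload and use &str in place of ProtoStr, and the Msg struct with its private Stored enum is a hypothetical storage layout chosen for the example.

```rust
// Simplified stand-in for the generated value-carrying enum.
#[allow(non_camel_case_types)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ExampleName<'msg> {
    FooInt(i32),
    FooString(&'msg str),
    not_set,
}

// Stand-in for the generated case enum; discriminants are the field numbers.
#[allow(non_camel_case_types)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExampleNameCase {
    FooInt = 4,
    FooString = 9,
    not_set = 0,
}

// Hypothetical storage: at most one member of the oneof is held at a time.
#[derive(Debug, Default)]
enum Stored {
    #[default]
    NotSet,
    FooInt(i32),
    FooString(String),
}

#[derive(Debug, Default)]
pub struct Msg {
    example_name: Stored,
}

impl Msg {
    // Setting one member replaces whatever member was set before.
    pub fn set_foo_int(&mut self, val: i32) {
        self.example_name = Stored::FooInt(val);
    }
    pub fn set_foo_string(&mut self, val: &str) {
        self.example_name = Stored::FooString(val.to_string());
    }
    pub fn has_foo_int(&self) -> bool {
        matches!(self.example_name, Stored::FooInt(_))
    }
    pub fn example_name(&self) -> ExampleName<'_> {
        match &self.example_name {
            Stored::NotSet => ExampleName::not_set,
            Stored::FooInt(v) => ExampleName::FooInt(*v),
            Stored::FooString(s) => ExampleName::FooString(s.as_str()),
        }
    }
    pub fn example_name_case(&self) -> ExampleNameCase {
        match self.example_name {
            Stored::NotSet => ExampleNameCase::not_set,
            Stored::FooInt(_) => ExampleNameCase::FooInt,
            Stored::FooString(_) => ExampleNameCase::FooString,
        }
    }
}
```

Calling set_foo_string() after set_foo_int() leaves has_foo_int() false, and the case enum can be cast to recover the field number of the active member.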
Enumerations
Given an enum definition like:
enum FooBar {
  FOO_BAR_UNKNOWN = 0;
  FOO_BAR_A = 1;
  FOO_B = 5;
  VALUE_C = 1234;
}
The compiler will generate:
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
#[repr(transparent)]
pub struct FooBar(i32);
impl FooBar {
    pub const Unknown: FooBar = FooBar(0);
    pub const A: FooBar = FooBar(1);
    pub const FooB: FooBar = FooBar(5);
    pub const ValueC: FooBar = FooBar(1234);
}
Note that for values with a prefix that matches the enum, the prefix will be stripped; this is done to improve ergonomics. Enum values are commonly prefixed with the enum name to avoid name collisions between sibling enums (which follow the semantics of C++ enums where the values are not scoped by their containing enum). Since the generated Rust consts are scoped within the impl, the additional prefix, which is beneficial to add in .proto files, would be redundant in Rust.
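Because the generated type is a struct with associated constants rather than a native Rust enum, the constants can still be used as match patterns, and a catch-all arm is needed for numbers outside the declared set. The sketch below copies the struct shown above and adds two hypothetical helpers, from_raw and raw, purely so the example is self-contained; they are not claimed to exist in generated code.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
#[repr(transparent)]
pub struct FooBar(i32);

#[allow(non_upper_case_globals)]
impl FooBar {
    pub const Unknown: FooBar = FooBar(0);
    pub const A: FooBar = FooBar(1);
    pub const FooB: FooBar = FooBar(5);
    pub const ValueC: FooBar = FooBar(1234);

    // Hypothetical helpers for this sketch only.
    pub fn from_raw(raw: i32) -> FooBar {
        FooBar(raw)
    }
    pub fn raw(self) -> i32 {
        self.0
    }
}

// Associated constants work as patterns because FooBar derives PartialEq and Eq.
pub fn describe(value: FooBar) -> &'static str {
    match value {
        FooBar::Unknown => "unknown",
        FooBar::A => "a",
        FooBar::FooB => "foo_b",
        FooBar::ValueC => "value_c",
        // A wrapped i32 can hold numbers with no named constant.
        _ => "unrecognized",
    }
}
```

In this model a value such as 99 is representable even though no constant names it, which is why the final wildcard arm is required.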
Extensions (proto2 only)
A Rust API for extensions is currently a work in progress. Extension fields will be maintained through parse/serialize, and in a C++ interop case any extensions set will be retained if the message is accessed from Rust (and propagated in the case of a message copy or merge).
Arena Allocation
A Rust API for arena allocated messages has not yet been implemented.
Internally, Protobuf Rust on upb kernel uses arenas, but on C++ kernels it doesn’t. However, references (both const and mutable) to messages that were arena allocated in C++ can be safely passed to Rust to be accessed or mutated.
Services
A Rust API for services has not yet been implemented.