Ruby Generated Code Guide
You should read the language guides for proto2 or proto3 before reading this document.
The protocol compiler for Ruby emits Ruby source files that use a DSL to define the message schema. However, the DSL is still subject to change, so this guide describes only the API of the generated messages, not the DSL.
Compiler Invocation
The protocol buffer compiler produces Ruby output when invoked with the --ruby_out= command-line flag. The parameter to the --ruby_out= option is the directory where you want the compiler to write your Ruby output. The compiler creates a .rb file for each .proto file input. The names of the output files are computed by taking the name of the .proto file and making two changes:
- The extension (.proto) is replaced with _pb.rb.
- The proto path (specified with the --proto_path= or -I command-line flag) is replaced with the output path (specified with the --ruby_out= flag).
So, for example, let’s say you invoke the compiler as follows:
protoc --proto_path=src --ruby_out=build/gen src/foo.proto src/bar/baz.proto
The compiler will read the files src/foo.proto and src/bar/baz.proto and produce two output files: build/gen/foo_pb.rb and build/gen/bar/baz_pb.rb.
The compiler will automatically create the directory build/gen/bar if necessary, but it will not create build or build/gen; they must already exist.
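The two naming rules above can be sketched in plain Ruby. The helper name ruby_output_path and its keyword arguments are hypothetical and exist only for illustration; protoc does this internally and exposes no such function.

```ruby
# Hypothetical helper (not part of protoc): mirrors the two naming rules above.
def ruby_output_path(proto_file, proto_path:, ruby_out:)
  # The input file must live under the proto path.
  prefix = proto_path.end_with?("/") ? proto_path : "#{proto_path}/"
  unless proto_file.start_with?(prefix)
    raise ArgumentError, "#{proto_file} is not under #{proto_path}"
  end

  # Rule 2: drop the proto path so the rest can be re-rooted under the output path.
  relative = proto_file.delete_prefix(prefix)

  # Rule 1: replace the .proto extension with _pb.rb.
  File.join(ruby_out, relative.sub(/\.proto\z/, "_pb.rb"))
end

puts ruby_output_path("src/foo.proto", proto_path: "src", ruby_out: "build/gen")
puts ruby_output_path("src/bar/baz.proto", proto_path: "src", ruby_out: "build/gen")
```

With the invocation shown earlier, this prints build/gen/foo_pb.rb and build/gen/bar/baz_pb.rb.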
Packages
The package name defined in the .proto file is used to generate a module structure for the generated messages. Given a file like:
package foo_bar.baz;
message MyMessage {}
The protocol compiler generates an output message with the name FooBar::Baz::MyMessage.
However, if the .proto file contains the ruby_package option, like this:
option ruby_package = "Foo::Bar";
then the generated output gives precedence to the ruby_package option and generates Foo::Bar::MyMessage instead.
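As a rough illustration of the mapping described above, the following sketch converts a package name into a Ruby module path. The helper ruby_module_name is hypothetical, and the camel-casing rule is an approximation inferred from the foo_bar.baz example, not the compiler's actual implementation.

```ruby
# Hypothetical helper: approximates how a .proto package maps to Ruby modules.
def ruby_module_name(package, ruby_package: nil)
  # An explicit ruby_package option takes precedence over the package name.
  return ruby_package if ruby_package

  package.split(".").map { |component|
    # Camel-case each dot-separated component: foo_bar -> FooBar.
    component.split("_").reject(&:empty?).map { |word|
      word[0].upcase + word[1..-1]
    }.join
  }.join("::")
end

puts "#{ruby_module_name('foo_bar.baz')}::MyMessage"
puts "#{ruby_module_name('foo_bar.baz', ruby_package: 'Foo::Bar')}::MyMessage"
```

The first call prints FooBar::Baz::MyMessage; the second prints Foo::Bar::MyMessage because the option overrides the package.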
Messages
Given a simple message declaration:
message Foo {}
The protocol buffer compiler generates a class called Foo. The generated class derives from the Ruby Object class (protos have no common base class). Unlike C++ and Java, Ruby generated code is unaffected by the optimize_for option in the .proto file; in effect, all Ruby code is optimized for code size.
You should not create your own Foo subclasses. Generated classes are not designed for subclassing, and subclassing them may lead to "fragile base class" problems.
Ruby message classes define accessors for each field, and also provide the following standard methods:
- Message#dup, Message#clone: Performs a shallow copy of this message and returns the new copy.
- Message#==: Performs a deep equality comparison between two messages.
- Message#hash: Computes a shallow hash of the message’s value.
- Message#to_hash, Message#to_h: Converts the object to a Ruby Hash object. Only the top-level message is converted.
- Message#inspect: Returns a human-readable string representing this message.
- Message#[], Message#[]=: Gets or sets a field by string name. In the future this will probably also be used to get/set extensions.
The message classes also define the following methods as static. (In general we prefer static methods, since regular methods can conflict with field names you defined in your .proto file.)
- Message.decode(str): Decodes a binary protobuf for this message and returns it in a new instance.
- Message.encode(proto): Serializes a message object of this class to a binary string.
- Message.decode_json(str): Decodes a JSON text string for this message and returns it in a new instance.
- Message.encode_json(proto): Serializes a message object of this class to a JSON text string.
- Message.descriptor: Returns the Google::Protobuf::Descriptor object for this message.
When you create a message, you can conveniently initialize fields in the constructor. Here is an example of constructing and using a message:
message = MyMessage.new(:int_field => 1,
                        :string_field => "String",
                        :repeated_int_field => [1, 2, 3, 4],
                        :submessage_field => SubMessage.new(:foo => 42))
serialized = MyMessage.encode(message)
message2 = MyMessage.decode(serialized)
raise unless message2.int_field == 1
Nested Types
A message can be declared inside another message. For example:
message Foo {
message Bar { }
}
In this case, the Bar class is declared as a class inside of Foo, so you can refer to it as Foo::Bar.
Fields
For each field in a message type, there are accessor methods to set and get the field. So given a field foo you can write:
message.foo = get_value()
print message.foo
Whenever you set a field, the value is type-checked against the declared type of that field. If the value is of the wrong type (or out of range), an exception will be raised.
Singular Fields
For singular primitive fields (numbers, strings, and booleans), the value you assign to the field must be of the correct type and in the appropriate range:
- Number types: the value should be an Integer (Fixnum or Bignum on Ruby versions before 2.4) or a Float. The value you assign must be exactly representable in the target type. So assigning 1.0 to an int32 field is ok, but assigning 1.2 is not.
- Boolean fields: the value must be true or false. No other values will implicitly convert to true/false.
- Bytes fields: the assigned value must be a String object. The protobuf library will duplicate the string, convert it to ASCII-8BIT encoding, and freeze it.
- String fields: the assigned value must be a String object. The protobuf library will duplicate the string, convert it to UTF-8 encoding, and freeze it.
The library never calls #to_s, #to_i, or similar methods to convert values automatically. Convert values yourself first, if necessary.
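To make the "exactly representable" rule for number fields concrete, here is a minimal sketch of such a check for an int32 field. The function check_int32 is hypothetical and is not the library's implementation; the exception classes are chosen to match the TypeError and RangeError mentioned in the RepeatedField example in this guide.

```ruby
INT32_RANGE = (-2**31..2**31 - 1)

# Hypothetical check (not the protobuf library's code): approximates the
# rule for assigning a Ruby number to an int32 field.
def check_int32(value)
  # No implicit conversion: strings and other non-numbers are rejected.
  raise TypeError, "expected a number, got #{value.class}" unless value.is_a?(Numeric)

  # A Float is acceptable only if it has no fractional part.
  unless value == value.floor
    raise RangeError, "#{value} is not exactly representable as int32"
  end

  # The integral value must also fit in 32 signed bits.
  raise RangeError, "#{value} is out of range for int32" unless INT32_RANGE.cover?(value)

  value.to_i
end

puts check_int32(1.0)
```

Under these assumptions, check_int32(1.0) is accepted, while check_int32(1.2) and check_int32(2**33) raise RangeError and check_int32("1") raises TypeError.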
Checking Presence
When using optional fields, field presence is checked by calling a generated has_...? method. Setting any value, even the default value, marks the field as present. Fields can be cleared by calling a different generated clear_... method. For example, for a message MyMessage with an int32 field foo:
m = MyMessage.new
raise unless !m.has_foo?
m.foo = 0
raise unless m.has_foo?
m.clear_foo
raise unless !m.has_foo?
Singular Message Fields
For submessages, unset fields will return nil, so you can always tell if the message was explicitly set or not. To clear a submessage field, set its value explicitly to nil.
if message.submessage_field.nil?
  puts "Submessage field is unset."
else
  message.submessage_field = nil
  puts "Cleared submessage field."
end
In addition to comparing and assigning nil, generated messages have has_...? and clear_... methods, which behave the same as for basic types:
if !message.has_submessage_field?
  raise unless message.submessage_field == nil
  puts "Submessage field is unset."
else
  raise unless message.submessage_field != nil
  message.clear_submessage_field
  raise unless message.submessage_field == nil
  puts "Cleared submessage field."
end
When you assign a submessage, it must be a generated message object of the correct type.
It is possible to create message cycles when you assign submessages. For example:
// foo.proto
message RecursiveMessage {
RecursiveMessage submessage = 1;
}
# test.rb
require 'foo_pb'
message = RecursiveMessage.new
message.submessage = message
If you try to serialize this, the library will detect the cycle and fail to serialize.
Repeated Fields
Repeated fields are represented using a custom class Google::Protobuf::RepeatedField. This class acts like a Ruby Array and mixes in Enumerable. Unlike a regular Ruby array, RepeatedField is constructed with a specific type and expects all of the array members to have the correct type. The types and ranges are checked just like message fields.
int_repeatedfield = Google::Protobuf::RepeatedField.new(:int32, [1, 2, 3])
raise unless !int_repeatedfield.empty?
# Raises TypeError.
int_repeatedfield[2] = "not an int32"
# Raises RangeError
int_repeatedfield[2] = 2**33
message.int32_repeated_field = int_repeatedfield
# This isn't allowed; the regular Ruby array doesn't enforce types like we need.
message.int32_repeated_field = [1, 2, 3, 4]
# This is fine, since the elements are copied into the type-safe array.
message.int32_repeated_field += [1, 2, 3, 4]
# The elements can be cleared without reassigning.
int_repeatedfield.clear
raise unless int_repeatedfield.empty?
The RepeatedField type supports all of the same methods as a regular Ruby Array. You can convert it to a regular Ruby Array with repeated_field.to_a.
Unlike singular fields, has_...? methods are never generated for repeated fields.
Map Fields
Map fields are represented using a special class that acts like a Ruby Hash (Google::Protobuf::Map). Unlike a regular Ruby hash, Map is constructed with a specific type for the key and value and expects all of the map’s keys and values to have the correct type. The types and ranges are checked just like message fields and RepeatedField elements.
int_string_map = Google::Protobuf::Map.new(:int32, :string)
# Returns nil; the key is not in the map.
print int_string_map[5]
# Raises TypeError, value should be a string
int_string_map[11] = 200
# Ok.
int_string_map[123] = "abc"
message.int32_string_map_field = int_string_map
Enumerations
Since Ruby does not have native enums, we create a module for each enum with constants to define the values. Given the .proto file:
message Foo {
enum SomeEnum {
VALUE_A = 0;
VALUE_B = 5;
VALUE_C = 1234;
}
optional SomeEnum bar = 1;
}
You can refer to enum values like so:
print Foo::SomeEnum::VALUE_A # => 0
message.bar = Foo::SomeEnum::VALUE_A
You may assign either a number or a symbol to an enum field. When reading the value back, it will be a symbol if the enum value is known, or a number if it is unknown. Since proto3 uses open enum semantics, any number may be assigned to an enum field, even if it was not defined in the enum.
message.bar = 0
puts message.bar.inspect # => :VALUE_A
message.bar = :VALUE_B
puts message.bar.inspect # => :VALUE_B
message.bar = 999
puts message.bar.inspect # => 999
# Raises: RangeError: Unknown symbol value for enum field.
message.bar = :UNDEFINED_VALUE
# Switching on an enum value is convenient.
case message.bar
when :VALUE_A
# ...
when :VALUE_B
# ...
when :VALUE_C
# ...
else
# ...
end
An enum module also defines the following utility methods:
- Foo::SomeEnum.lookup(number): Looks up the given number and returns its name, or nil if none was found. If more than one name has this number, returns the first that was defined.
- Foo::SomeEnum.resolve(symbol): Returns the number for this enum name, or nil if none was found.
- Foo::SomeEnum.descriptor: Returns the descriptor for this enum.
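The behavior of lookup and resolve can be imitated with a hand-written module. SomeEnumSketch below is a hypothetical stand-in written for illustration only; real enum modules are built by the protobuf library and also provide descriptor, which this sketch omits.

```ruby
# Hand-written stand-in for a generated enum module (illustration only).
module SomeEnumSketch
  VALUE_A = 0
  VALUE_B = 5
  VALUE_C = 1234

  # Number -> name as a Symbol, or nil if no constant has this number.
  # find returns the first match among this module's own constants.
  def self.lookup(number)
    constants(false).find { |name| const_get(name) == number }
  end

  # Name (Symbol) -> number, or nil if the name is not defined here.
  def self.resolve(symbol)
    constants(false).include?(symbol) ? const_get(symbol) : nil
  end
end

p SomeEnumSketch.lookup(5)
p SomeEnumSketch.resolve(:VALUE_C)
p SomeEnumSketch.lookup(999)
```

This prints :VALUE_B, 1234, and nil. Passing false to constants restricts the search to the module's own constants, so unrelated top-level names never resolve.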
Oneof
Given a message with a oneof:
message Foo {
oneof test_oneof {
string name = 1;
int32 serial_number = 2;
}
}
The Ruby class corresponding to Foo will have members called name and serial_number with accessor methods just like regular fields.
However, unlike regular fields, at most one of the fields in a oneof can be set at a time, so setting one field will clear the others.
message = Foo.new
# Fields have their defaults.
raise unless message.name == ""
raise unless message.serial_number == 0
raise unless message.test_oneof == nil
message.name = "Bender"
raise unless message.name == "Bender"
raise unless message.serial_number == 0
raise unless message.test_oneof == :name
# Setting serial_number clears name.
message.serial_number = 2716057
raise unless message.name == ""
raise unless message.test_oneof == :serial_number
# Setting serial_number to nil clears the oneof.
message.serial_number = nil
raise unless message.test_oneof == nil
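The "at most one member set" rule can be modeled with a small plain-Ruby class. FooSketch is a hypothetical stand-in that only mimics the behavior shown in the example above; it is not how the generated class is implemented.

```ruby
# Hand-written stand-in (not generated code) for the test_oneof example.
class FooSketch
  DEFAULTS = { name: "", serial_number: 0 }.freeze

  def initialize
    @which = nil # name of the member currently set, or nil
    @value = nil # value of that member
  end

  # Mirrors message.test_oneof: the set member's name, or nil.
  def test_oneof
    @which
  end

  DEFAULTS.each_key do |field|
    # Reader: the stored value if this member is set, otherwise its default.
    define_method(field) { @which == field ? @value : DEFAULTS[field] }

    # Writer: only one slot exists, so setting a member replaces whichever
    # member was set before. In this sketch, assigning nil clears the oneof.
    define_method("#{field}=") do |value|
      @which = value.nil? ? nil : field
      @value = value
    end
  end
end

message = FooSketch.new
message.name = "Bender"
message.serial_number = 2716057
p message.name
p message.test_oneof
```

Because the sketch stores a single value plus the name of the member that owns it, setting serial_number makes name read as its default again; the last two lines print "" and :serial_number.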
For proto2 messages, oneof members have individual has_...? methods as well:
message = Foo.new
raise unless !message.has_test_oneof?
raise unless !message.has_name?
raise unless !message.has_serial_number?
message.name = "Bender"
raise unless message.has_test_oneof?
raise unless message.has_name?
raise unless !message.has_serial_number?