Changes announced April 20, 2023

Changes announced for Protocol Buffers on April 20, 2023.

Changes to Ruby Generator

This GitHub PR, which will appear in the 23.x release, changes the Ruby code generator to emit a serialized proto instead of the DSL.

It removes the DSL from the code generator in anticipation of splitting the DSL out into a separate package.

Given a .proto file like:

syntax = "proto3";

package pkg;

message TestMessage {
  optional int32 i32 = 1;
  optional TestMessage msg = 2;
}

Generated code before:

# Generated by the protocol buffer compiler.  DO NOT EDIT!
# source: protoc_explorer/main.proto

require 'google/protobuf'

Google::Protobuf::DescriptorPool.generated_pool.build do
  add_file("test.proto", :syntax => :proto3) do
    add_message "pkg.TestMessage" do
      proto3_optional :i32, :int32, 1
      proto3_optional :msg, :message, 2, "pkg.TestMessage"
    end
  end
end

module Pkg
  TestMessage = ::Google::Protobuf::DescriptorPool.generated_pool.lookup("pkg.TestMessage").msgclass
end

Generated code after:

# frozen_string_literal: true
# Generated by the protocol buffer compiler.  DO NOT EDIT!
# source: test.proto

require 'google/protobuf'

descriptor_data = "\n\ntest.proto\x12\x03pkg\"S\n\x0bTestMessage\x12\x10\n\x03i32\x18\x01 \x01(\x05H\x00\x88\x01\x01\x12\"\n\x03msg\x18\x02 \x01(\x0b\x32\x10.pkg.TestMessageH\x01\x88\x01\x01\x42\x06\n\x04_i32B\x06\n\x04_msgb\x06proto3"
begin
  Google::Protobuf::DescriptorPool.generated_pool.add_serialized_file(descriptor_data)
rescue TypeError => e
  # <compatibility code, see below>
end

module Pkg
  TestMessage = ::Google::Protobuf::DescriptorPool.generated_pool.lookup("pkg.TestMessage").msgclass
end

This change fixes nearly all remaining conformance problems that existed previously. This is a side effect of moving from the DSL (which is lossy) to a serialized descriptor (which preserves all information).

Backward Compatibility

This change should be 100% compatible with Ruby Protobuf >= 3.18.0, released in Sept 2021. Additionally, it should be compatible with all existing users and deployments.

There is some special compatibility code inserted to achieve this level of backward compatibility that you should be aware of. Without the compatibility code, there is an edge case that could break backward compatibility. The previous code is lax in a way that the new code will be more strict.

When using a full serialized descriptor, it contains a list of all .proto files imported by this file (whereas the DSL never added dependencies properly). See the code in descriptor.proto.

add_serialized_file verifies that all dependencies listed in the descriptor were previously added with add_serialized_file. Generally that should be fine, because the generated code will contain Ruby require statements for all dependencies, and the descriptor will fail to load anyway if the types depended on were not previously defined in the DescriptorPool.

But there is a potential for problems if there are ambiguities around file paths. For example, consider the following scenario:

// foo/bar.proto

syntax = "proto2";

message Bar {}
// foo/baz.proto

syntax = "proto2";

import "bar.proto";

message Baz {
  optional Bar bar = 1;
}

If you invoke protoc like so, it will work correctly:

$ protoc --ruby_out=. -Ifoo foo/bar.proto foo/baz.proto
$ RUBYLIB=. ruby baz_pb.rb

However if you invoke protoc like so, and didn’t have any compatibility code, it would fail to load:

$ protoc --ruby_out=. -I. -Ifoo foo/baz.proto
$ protoc --ruby_out=. -I. -Ifoo foo/bar.proto
$ RUBYLIB=foo ruby foo/baz_pb.rb
foo/baz_pb.rb:10:in `add_serialized_file': Unable to build file to DescriptorPool: Depends on file 'bar.proto', but it has not been loaded (Google::Protobuf::TypeError)
    from foo/baz_pb.rb:10:in `<main>'

The problem is that bar.proto is being referred to by two different canonical names: bar.proto and foo/bar.proto. This is a user error: each import should always be referred to by a consistent full path. Hopefully user errors of this sort will be rare, but it is hard to know without trying.

The code in this change prints a warning using warn if we detect that this edge case has occurred:

$ RUBYLIB=foo ruby foo/baz_pb.rb
Warning: Protobuf detected an import path issue while loading generated file foo/baz_pb.rb
- foo/baz.proto imports bar.proto, but that import was loaded as foo/bar.proto
Each proto file must use a consistent fully-qualified name.
This will become an error in the next major version.

There are two possible fixes in this case. One is to consistently use the name bar.proto for the import (removing -I.). The other is to consistently use the name foo/bar.proto for the import (changing the import line to import "foo/bar.proto" and removing -Ifoo).

We plan to remove this compatibility code in the next major version.