Kotlin Generated Code Guide

Describes exactly what Kotlin code the protocol buffer compiler generates for any given protocol definition, in addition to the code generated for Java.

Any differences between proto2 and proto3 generated code are highlighted. Note that these differences are in the generated code as described in this document, not the base message classes/interfaces, which are the same in both versions. You should read the proto2 language guide and/or proto3 language guide before reading this document.

Compiler Invocation

The protocol buffer compiler produces Kotlin code that builds on top of Java code. As a result, it must be invoked with two command-line flags, --java_out= and --kotlin_out=. The parameter to --java_out= is the directory where you want the compiler to write your Java output, and the parameter to --kotlin_out= is the directory where you want it to write your Kotlin output. For each .proto file input, the compiler creates a wrapper .java file containing a Java class which represents the .proto file itself.

The compiler creates separate .kt files for the classes and factory methods that it generates for each top-level message declared in the .proto file. It does so whether or not your .proto file contains a line like the following:

option java_multiple_files = true;

The Java package name for each file is the same as that used by the generated Java code as described in the Java generated code reference.

The output file is chosen by concatenating the parameter to --kotlin_out=, the package name (with periods [.] replaced with slashes [/]), and the message name followed by the suffix Kt.kt.

So, for example, let’s say you invoke the compiler as follows:

protoc --proto_path=src --java_out=build/gen/java --kotlin_out=build/gen/kotlin src/foo.proto

If foo.proto’s Java package is com.example and it contains a message named Bar, then the protocol buffer compiler will generate the file build/gen/kotlin/com/example/BarKt.kt. The protocol buffer compiler will automatically create the build/gen/kotlin/com and build/gen/kotlin/com/example directories if needed. However, it will not create build/gen/kotlin, build/gen, or build; they must already exist. You can specify multiple .proto files in a single invocation; all output files will be generated at once.
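
The path rule above can be restated as a small helper. This is only a sketch: the function name kotlinOutputPath and its parameters are hypothetical and are not part of protoc. It assumes a single top-level message and forward-slash separators.

```kotlin
// Hypothetical helper (not part of protoc): restates the output path rule
// described above for one top-level message.
fun kotlinOutputPath(kotlinOut: String, javaPackage: String, messageName: String): String {
    // Periods in the package name become directory separators.
    val packageDir = javaPackage.replace('.', '/')
    // The file is named after the message, followed by the suffix Kt.kt.
    val fileName = messageName + "Kt.kt"
    return listOf(kotlinOut.trimEnd('/'), packageDir, fileName)
        .filter { it.isNotEmpty() }   // an empty package contributes no directory
        .joinToString("/")
}

fun main() {
    // prints build/gen/kotlin/com/example/BarKt.kt
    println(kotlinOutputPath("build/gen/kotlin", "com.example", "Bar"))
}
```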

Messages

Given a simple message declaration:

message FooBar {}

The protocol buffer compiler generates—in addition to the generated Java code—an object called FooBarKt, as well as two top-level functions, having the following structure:

object FooBarKt {
  class Dsl private constructor { ... }
}
inline fun fooBar(block: FooBarKt.Dsl.() -> Unit): FooBar
inline fun FooBar.copy(block: FooBarKt.Dsl.() -> Unit): FooBar

Nested Types

A message can be declared inside another message. For example:

message Foo {
  message Bar { }
}

In this case, the compiler nests the BarKt object and the bar factory method inside FooKt, though the copy method remains top-level:

object FooKt {
  class Dsl private constructor { ... }
  object BarKt {
    class Dsl private constructor { ... }
  }
  inline fun bar(block: FooKt.BarKt.Dsl.() -> Unit): Foo.Bar
}
inline fun foo(block: FooKt.Dsl.() -> Unit): Foo
inline fun Foo.copy(block: FooKt.Dsl.() -> Unit): Foo
inline fun Foo.Bar.copy(block: FooKt.BarKt.Dsl.() -> Unit): Foo.Bar
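
To show how these declarations look at a call site, here is a minimal sketch built on hand-written stand-ins. Nothing in it is real generated code: the empty Foo and Foo.Bar classes, the internal constructors, and the build() helpers are assumptions made so the example compiles on its own, and the functions are not inline as the generated ones are.

```kotlin
// Hand-written stand-ins for the Java classes that protoc would generate.
class Foo {
    class Bar
}

// Simplified stand-in for the generated Kotlin. The real Dsl classes wrap Java
// builders; internal constructors are used here only so the factory functions
// in this file can call them.
object FooKt {
    class Dsl internal constructor() {
        internal fun build(): Foo = Foo()
    }

    object BarKt {
        class Dsl internal constructor() {
            internal fun build(): Foo.Bar = Foo.Bar()
        }
    }

    // The factory for the nested message is a member of FooKt.
    fun bar(block: BarKt.Dsl.() -> Unit): Foo.Bar = BarKt.Dsl().apply(block).build()
}

fun foo(block: FooKt.Dsl.() -> Unit): Foo = FooKt.Dsl().apply(block).build()

// The stand-in messages have no fields, so copy simply builds a fresh instance.
fun Foo.copy(block: FooKt.Dsl.() -> Unit): Foo = FooKt.Dsl().apply(block).build()
fun Foo.Bar.copy(block: FooKt.BarKt.Dsl.() -> Unit): Foo.Bar =
    FooKt.BarKt.Dsl().apply(block).build()

fun main() {
    val outer: Foo = foo { }
    val inner: Foo.Bar = FooKt.bar { }      // nested factory is qualified by FooKt
    val outerCopy: Foo = outer.copy { }     // copy functions are top-level extensions
    val innerCopy: Foo.Bar = inner.copy { }
    println(outerCopy !== outer && innerCopy !== inner)
}
```

The point of the sketch is the qualification: the nested factory is reached as FooKt.bar, while copy is called directly on a Foo.Bar instance.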

Fields

In addition to the methods described in the previous section, the protocol buffer compiler generates mutable properties in the DSL for each field defined within the message in the .proto file. (Kotlin already infers read-only properties on the message object from the getters generated by Java.)

Note that properties always use camel-case naming, even if the field name in the .proto file uses lower-case with underscores (as it should). The case-conversion works as follows:

  1. For each underscore in the name, the underscore is removed, and the following letter is capitalized.
  2. If the name will have a prefix attached (for example, “clear”), the first letter is capitalized. Otherwise, it is lower-cased.

Thus, the field foo_bar_baz becomes fooBarBaz.
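
The two steps can be sketched as follows. The function toCamelCase is a hypothetical illustration, not the compiler's implementation, and it does not cover the reserved-word handling mentioned in this guide.

```kotlin
// Hypothetical helper restating the two conversion steps listed above.
fun toCamelCase(fieldName: String, prefix: String = ""): String {
    // Step 1: remove each underscore and capitalize the letter that follows it.
    val converted = StringBuilder()
    var capitalizeNext = false
    for (ch in fieldName) {
        when {
            ch == '_' -> capitalizeNext = true
            capitalizeNext -> {
                converted.append(ch.uppercaseChar())
                capitalizeNext = false
            }
            else -> converted.append(ch)
        }
    }
    // Step 2: capitalize the first letter after a prefix, lower-case it otherwise.
    val name = converted.toString()
    return if (prefix.isNotEmpty()) {
        prefix + name.replaceFirstChar { it.uppercaseChar() }
    } else {
        name.replaceFirstChar { it.lowercaseChar() }
    }
}

fun main() {
    println(toCamelCase("foo_bar_baz"))            // prints fooBarBaz
    println(toCamelCase("foo_bar_baz", "clear"))   // prints clearFooBarBaz
}
```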

In a few special cases in which a field name conflicts with reserved words in Kotlin or methods already defined in the protobuf library, an extra underscore is appended. For instance, the clearer for a field named in is clearIn_().

Singular Fields (proto2)

For any of these field definitions:

optional int32 foo = 1;
required int32 foo = 1;

The compiler will generate the following accessors in the DSL:

  • fun hasFoo(): Boolean: Returns true if the field is set.
  • var foo: Int: The current value of the field. If the field is not set, returns the default value.
  • fun clearFoo(): Clears the value of the field. After calling this, hasFoo() will return false and foo will return the default value.

For other simple field types, the corresponding Java type is chosen according to the scalar value types table. For message and enum types, the value type is replaced with the message or enum class. As the message type is still defined in Java, unsigned types in the message are represented using the standard corresponding signed types in the DSL, for compatibility with Java and older versions of Kotlin.
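
The sketch below shows how these accessors behave inside the factory and copy functions. It assumes a message named FooBar containing optional int32 foo = 1. Every class in it is a hand-written stand-in rather than protoc output: the fooValue storage, the internal constructors, and the build() helper are assumptions, and the functions are not inline as the generated ones are.

```kotlin
// Stand-in for the Java message class (normally generated by protoc).
class FooBar private constructor(private val fooValue: Int?) {
    val foo: Int get() = fooValue ?: 0            // an unset field reads as the default
    fun hasFoo(): Boolean = fooValue != null
    fun toBuilder(): Builder = Builder(fooValue)

    class Builder internal constructor(internal var fooValue: Int? = null) {
        fun build(): FooBar = FooBar(fooValue)
    }

    companion object {
        fun newBuilder(): Builder = Builder()
    }
}

// Stand-in for the generated Kotlin DSL: a thin wrapper around the builder.
object FooBarKt {
    class Dsl internal constructor(private val builder: FooBar.Builder) {
        var foo: Int
            get() = builder.fooValue ?: 0
            set(value) { builder.fooValue = value }

        fun hasFoo(): Boolean = builder.fooValue != null
        fun clearFoo() { builder.fooValue = null }
        internal fun build(): FooBar = builder.build()
    }
}

fun fooBar(block: FooBarKt.Dsl.() -> Unit): FooBar =
    FooBarKt.Dsl(FooBar.newBuilder()).apply(block).build()

fun FooBar.copy(block: FooBarKt.Dsl.() -> Unit): FooBar =
    FooBarKt.Dsl(toBuilder()).apply(block).build()

fun main() {
    val message = fooBar { foo = 5 }
    val cleared = message.copy { clearFoo() }
    println(message.hasFoo())   // true
    println(cleared.hasFoo())   // false
    println(cleared.foo)        // 0, the default value
    println(message.foo)        // 5, the original message is unchanged
}
```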

Embedded Message Fields

Note that there is no special handling of submessages. For example, if you have a field

optional Foo my_foo = 1;

you must write

myFoo = foo {
  ...
}

In general, this is because the compiler does not know whether Foo has a Kotlin DSL at all, or e.g. only has the Java APIs generated. This means that you do not have to wait for messages you depend on to add Kotlin code generation.

Singular Fields (proto3)

For this field definition:

int32 foo = 1;

The compiler will generate the following members in the DSL:

  • var foo: Int: The current value of the field. If the field is not set, returns the default value for the field’s type.
  • fun clearFoo(): Clears the value of the field. After calling this, foo will return the default value for the field’s type.

For other simple field types, the corresponding Java type is chosen according to the scalar value types table. For message and enum types, the value type is replaced with the message or enum class. As the message type is still defined in Java, unsigned types in the message are represented using the standard corresponding signed types in the DSL, for compatibility with Java and older versions of Kotlin.

Embedded Message Fields

For message field types, an additional accessor method is generated in the DSL:

  • fun hasFoo(): Boolean: Returns true if the field has been set.

Note that there is no shortcut for setting a submessage based on a DSL. For example, if you have a field

Foo my_foo = 1;

you must write

myFoo = foo {
  ...
}

In general, this is because the compiler does not know whether Foo has a Kotlin DSL at all, or e.g. only has the Java APIs generated. This means that you do not have to wait for messages you depend on to add Kotlin code generation.

Repeated Fields

For this field definition:

repeated string foo = 1;

The compiler will generate the following members in the DSL:

  • class FooProxy private constructor(): DslProxy(), an unconstructable type used only in generics
  • val fooList: DslList<String, FooProxy>, a read-only view of the list of current elements in the repeated field
  • fun DslList<String, FooProxy>.add(value: String), an extension function allowing elements to be added to the repeated field
  • operator fun DslList<String, FooProxy>.plusAssign(value: String), an alias for add
  • fun DslList<String, FooProxy>.addAll(values: Iterable<String>), an extension function allowing an Iterable of elements to be added to the repeated field
  • operator fun DslList<String, FooProxy>.plusAssign(values: Iterable<String>), an alias for addAll
  • operator fun DslList<String, FooProxy>.set(index: Int, value: String), an extension function setting the value of the element at the given zero-based index
  • fun DslList<String, FooProxy>.clear(), an extension function clearing the contents of the repeated field

This unusual construction allows fooList to "behave like" a mutable list within the scope of the DSL, supporting only the methods supported by the underlying builder, while preventing mutability from "escaping" the DSL, which could cause confusing side effects.
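
A simplified sketch of this construction follows. DslProxy and DslList here are hand-written stand-ins for the library types of the same names. The fooStorage list (standing in for the Java builder), the snapshot() helper, and a fooBar factory that returns a plain List are all assumptions made to keep the example self-contained.

```kotlin
// Simplified stand-ins for the protobuf Kotlin runtime types.
abstract class DslProxy protected constructor()

// A read-only list view tagged with a phantom proxy type P.
class DslList<E, P : DslProxy> internal constructor(
    private val delegate: List<E>
) : List<E> by delegate

object FooBarKt {
    class Dsl internal constructor() {
        private val fooStorage = mutableListOf<String>()   // stands in for the builder

        // Unconstructable marker type: it only distinguishes this field's list type.
        class FooProxy private constructor() : DslProxy()

        val fooList: DslList<String, FooProxy>
            get() = DslList(fooStorage)

        // Member extensions: in scope only where a Dsl instance is a receiver.
        fun DslList<String, FooProxy>.add(value: String) { fooStorage.add(value) }
        operator fun DslList<String, FooProxy>.plusAssign(value: String) { add(value) }
        fun DslList<String, FooProxy>.addAll(values: Iterable<String>) { fooStorage.addAll(values) }
        operator fun DslList<String, FooProxy>.plusAssign(values: Iterable<String>) { addAll(values) }
        operator fun DslList<String, FooProxy>.set(index: Int, value: String) { fooStorage[index] = value }
        fun DslList<String, FooProxy>.clear() { fooStorage.clear() }

        internal fun snapshot(): List<String> = fooStorage.toList()
    }
}

// Returns the field contents instead of a message, to keep the sketch short.
fun fooBar(block: FooBarKt.Dsl.() -> Unit): List<String> =
    FooBarKt.Dsl().apply(block).snapshot()

fun main() {
    val built = fooBar {
        fooList.add("a")
        fooList += "b"
        fooList += listOf("c", "d")
        fooList[0] = "z"
    }
    println(built)   // prints [z, b, c, d]
}
```

Outside the fooBar block, the mutating extensions are not in scope, so a DslList value that leaks out can only be read as a List.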

For other simple field types, the corresponding Java type is chosen according to the scalar value types table. For message and enum types, the type is the message or enum class.

Oneof Fields

For this oneof field definition:

oneof oneof_name {
    int32 foo = 1;
    ...
}

The compiler will generate the following accessor methods in the DSL:

  • val oneofNameCase: OneofNameCase: gets which, if any, of the oneof_name fields is set; see the Java code reference for the return type
  • fun hasFoo(): Boolean (proto2 only): Returns true if the oneof case is FOO.
  • var foo: Int: The current value of oneof_name if the oneof case is FOO. Otherwise, returns the default value of this field.

For other simple field types, the corresponding Java type is chosen according to the scalar value types table. For message and enum types, the value type is replaced with the message or enum class.

Map Fields

For this map field definition:

map<int32, int32> weight = 1;

The compiler will generate the following members in the DSL class:

  • class WeightProxy private constructor(): DslProxy(), an unconstructable type used only in generics
  • val weight: DslMap<Int, Int, WeightProxy>, a read-only view of the current entries in the map field
  • fun DslMap<Int, Int, WeightProxy>.put(key: Int, value: Int): add the entry to this map field
  • operator fun DslMap<Int, Int, WeightProxy>.set(key: Int, value: Int): alias for put using operator syntax
  • fun DslMap<Int, Int, WeightProxy>.remove(key: Int): removes the entry associated with key, if present
  • fun DslMap<Int, Int, WeightProxy>.putAll(map: Map<Int, Int>): adds all entries from the specified map to this map field, overwriting prior values for already present keys
  • fun DslMap<Int, Int, WeightProxy>.clear(): clears all entries from this map field
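
The map members can be sketched with hand-written stand-ins in the same way. DslProxy and DslMap below are simplified versions of the library types. The weightStorage map, the snapshot() helper, and a fooBar factory that returns a plain Map are assumptions. The operator form is named set, because Kotlin's indexed-assignment convention (weight[key] = value) resolves to a function with that name.

```kotlin
// Simplified stand-ins for the protobuf Kotlin runtime types.
abstract class DslProxy protected constructor()

// A read-only map view tagged with a phantom proxy type P.
class DslMap<K, V, P : DslProxy> internal constructor(
    private val delegate: Map<K, V>
) : Map<K, V> by delegate

object FooBarKt {
    class Dsl internal constructor() {
        private val weightStorage = mutableMapOf<Int, Int>()   // stands in for the builder

        class WeightProxy private constructor() : DslProxy()

        val weight: DslMap<Int, Int, WeightProxy>
            get() = DslMap(weightStorage)

        fun DslMap<Int, Int, WeightProxy>.put(key: Int, value: Int) { weightStorage[key] = value }
        operator fun DslMap<Int, Int, WeightProxy>.set(key: Int, value: Int) { put(key, value) }
        fun DslMap<Int, Int, WeightProxy>.remove(key: Int) { weightStorage.remove(key) }
        fun DslMap<Int, Int, WeightProxy>.putAll(map: Map<Int, Int>) { weightStorage.putAll(map) }
        fun DslMap<Int, Int, WeightProxy>.clear() { weightStorage.clear() }

        internal fun snapshot(): Map<Int, Int> = weightStorage.toMap()
    }
}

// Returns the field contents instead of a message, to keep the sketch short.
fun fooBar(block: FooBarKt.Dsl.() -> Unit): Map<Int, Int> =
    FooBarKt.Dsl().apply(block).snapshot()

fun main() {
    val built = fooBar {
        weight.put(1, 10)
        weight[2] = 20                           // operator syntax
        weight.putAll(mapOf(2 to 25, 3 to 30))   // overwrites the value for key 2
        weight.remove(1)
    }
    println(built)   // prints {2=25, 3=30}
}
```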

Extensions (proto2 only)

Given a message with an extension range:

message Foo {
  extensions 100 to 199;
}

The protocol buffer compiler will add the following methods to FooKt.Dsl:

  • operator fun <T> get(extension: ExtensionLite<Foo, T>): T: gets the current value of the extension field in the DSL
  • operator fun <T> get(extension: ExtensionLite<Foo, List<T>>): ExtensionList<T, Foo>: gets the current value of the repeated extension field in the DSL as a read-only List
  • operator fun <T : Comparable<T>> set(extension: ExtensionLite<Foo, T>, value: T): sets the current value of the extension field in the DSL (for Comparable field types)
  • operator fun <T : MessageLite> set(extension: ExtensionLite<Foo, T>, value: T): sets the current value of the extension field in the DSL (for message field types)
  • operator fun set(extension: ExtensionLite<Foo, ByteString>, value: ByteString): sets the current value of the extension field in the DSL (for bytes fields)
  • operator fun contains(extension: ExtensionLite<Foo, *>): Boolean: returns true if the extension field has a value
  • fun clear(extension: ExtensionLite<Foo, *>): clears the extension field
  • fun <E> ExtensionList<E, Foo>.add(value: E): adds a value to the repeated extension field
  • operator fun <E> ExtensionList<E, Foo>.plusAssign(value: E): alias for add using operator syntax
  • fun <E> ExtensionList<E, Foo>.addAll(values: Iterable<E>): adds multiple values to the repeated extension field
  • operator fun <E> ExtensionList<E, Foo>.plusAssign(values: Iterable<E>): alias for addAll using operator syntax
  • operator fun <E> ExtensionList<E, Foo>.set(index: Int, value: E): sets the element of the repeated extension field at the specified index
  • inline fun ExtensionList<*, Foo>.clear(): clears the elements of the repeated extension field

The generics here are complex, but the effect is that this[extension] = value works for every extension type except repeated extensions, and repeated extensions have "natural" list syntax that works similarly to non-extension repeated fields.

Given an extension definition:

extend Foo {
  optional int32 bar = 123;
}

Java generates the "extension identifier" bar, which is used to "key" extension operations above.
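
The following sketch illustrates the this[extension] = value syntax for a singular extension. It relies on heavily simplified stand-ins: the ExtensionLite class here is a hypothetical two-parameter holder, not the real com.google.protobuf type, and the values map, the defaultValue property, and the empty Foo class are assumptions made for illustration.

```kotlin
// Simplified stand-ins; the real Foo and FooKt.Dsl are generated by protoc.
class Foo
class ExtensionLite<M, T>(val defaultValue: T)

// Stand-in for the "extension identifier" that Java generates for bar.
val bar = ExtensionLite<Foo, Int>(defaultValue = 0)

object FooKt {
    class Dsl internal constructor() {
        // Hypothetical storage; the real DSL delegates to the Java builder.
        private val values = mutableMapOf<ExtensionLite<Foo, *>, Any?>()

        @Suppress("UNCHECKED_CAST")
        operator fun <T> get(extension: ExtensionLite<Foo, T>): T =
            if (values.containsKey(extension)) values[extension] as T else extension.defaultValue

        operator fun <T : Comparable<T>> set(extension: ExtensionLite<Foo, T>, value: T) {
            values[extension] = value
        }

        operator fun contains(extension: ExtensionLite<Foo, *>): Boolean =
            values.containsKey(extension)

        fun clear(extension: ExtensionLite<Foo, *>) { values.remove(extension) }
    }
}

fun foo(block: FooKt.Dsl.() -> Unit): Foo {
    FooKt.Dsl().apply(block)
    return Foo()   // the stand-in message carries no data
}

fun main() {
    foo {
        println(bar in this)   // false
        this[bar] = 123
        println(bar in this)   // true
        println(this[bar])     // 123
        clear(bar)
        println(this[bar])     // 0, the default value
    }
}
```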

Why Doesn’t Protobuf Support Nullable Setters/Getters?

We have heard feedback that some folks would like protobuf to support nullable getters/setters in their null-friendly language of choice (particularly Kotlin, C#, and Rust). While this does seem to be a helpful feature for folks using those languages, the design choice has tradeoffs which have led to the Protobuf team choosing not to implement them.

The biggest reason not to have nullable fields is the intended behavior of default values specified in a .proto file. By design, calling a getter on an unset field will return the default value of that field.

As an example, consider this .proto file:

message Msg { optional Child child = 1; }
message Child { optional Grandchild grandchild = 1; }
message Grandchild { optional int32 foo = 1 [default = 72]; }

and corresponding Kotlin getters:

// With our API where getters are always non-nullable:
msg.child.grandchild.foo == 72

// With nullable submessages the ?. operator fails to get the default value:
msg?.child?.grandchild?.foo == null

// Or verbosely duplicating the default value at the usage site:
(msg?.child?.grandchild?.foo ?: 72)

If a nullable getter existed, it would necessarily ignore the user-specified defaults (to return null instead) which would lead to surprising and inconsistent behavior. If users of nullable getters want to access the default value of the field, they would have to write their own custom handling to use the default if null is returned, which removes the supposed benefit of cleaner/easier code with null getters.
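
The contrast can be made runnable with hand-written stand-ins. None of the classes below are generated code: the first group imitates protobuf-style getters that fall back to defaults, and the second group (the Nullable* classes) is a hypothetical nullable API that protobuf does not provide.

```kotlin
// Style 1: protobuf-like getters. Unset fields read as defaults, and unset
// submessages read as default instances.
class Grandchild(private val fooOrNull: Int? = null) {
    val foo: Int get() = fooOrNull ?: 72   // [default = 72] from the .proto file
}
class Child(private val grandchildOrNull: Grandchild? = null) {
    val grandchild: Grandchild get() = grandchildOrNull ?: Grandchild()
}
class Msg(private val childOrNull: Child? = null) {
    val child: Child get() = childOrNull ?: Child()
}

// Style 2: hypothetical nullable getters, where unset fields read as null.
class NullableGrandchild(val foo: Int? = null)
class NullableChild(val grandchild: NullableGrandchild? = null)
class NullableMsg(val child: NullableChild? = null)

fun main() {
    // The declared default survives a chain of unset submessages.
    println(Msg().child.grandchild.foo)                    // 72

    // With nullable getters the chain collapses to null and the default is lost.
    println(NullableMsg().child?.grandchild?.foo)          // null

    // The caller has to repeat the default from the .proto file by hand.
    println(NullableMsg().child?.grandchild?.foo ?: 72)    // 72
}
```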

Similarly, we do not provide nullable setters as the behavior would be unintuitive. Performing a set and then get would not always give the same value back, and calling a set would only sometimes affect the has-bit for the field.

Note that message-typed fields are always explicit presence fields (with hazzers). Proto3 defaults to scalar fields having implicit presence (without hazzers) unless they are explicitly marked optional, while Proto2 does not support implicit presence. With Editions, explicit presence is the default behavior unless an implicit presence feature is used. With the forward expectation that almost all fields will have explicit presence, the ergonomic problems that come with nullable getters are expected to matter more than they may have for Proto3 users.

Due to these issues, nullable setters/getters would radically change the way default values can be used. While we understand the possible utility, we have decided it’s not worth the inconsistencies and difficulty it introduces.