Protocol Buffers Edition 2023 Language Specification

Language specification reference for edition 2023 of the Protocol Buffers language.

The syntax is specified using Extended Backus-Naur Form (EBNF):

|   alternation
()  grouping
[]  option (zero or one time)
{}  repetition (any number of times)

Lexical Elements

Letters and Digits

letter = "A" ... "Z" | "a" ... "z"
capitalLetter =  "A" ... "Z"
decimalDigit = "0" ... "9"
octalDigit   = "0" ... "7"
hexDigit     = "0" ... "9" | "A" ... "F" | "a" ... "f"

Identifiers

ident = letter { letter | decimalDigit | "_" }
fullIdent = ident { "." ident }
messageName = ident
enumName = ident
fieldName = ident
oneofName = ident
mapName = ident
serviceName = ident
rpcName = ident
streamName = ident
messageType = [ "." ] { ident "." } messageName
enumType = [ "." ] { ident "." } enumName
groupName = capitalLetter { letter | decimalDigit | "_" }

Integer Literals

intLit     = decimalLit | octalLit | hexLit
decimalLit = [-] ( "1" ... "9" ) { decimalDigit }
octalLit   = [-] "0" { octalDigit }
hexLit     = [-] "0" ( "x" | "X" ) hexDigit { hexDigit }

Floating-point Literals

floatLit = [-] ( decimals "." [ decimals ] [ exponent ] | decimals exponent | "."decimals [ exponent ] ) | "inf" | "nan"
decimals  = [-] decimalDigit { decimalDigit }
exponent  = ( "e" | "E" ) [ "+" | "-" ] decimals

Boolean

boolLit = "true" | "false"

String Literals

strLit = strLitSingle { strLitSingle }
strLitSingle = ( "'" { charValue } "'" ) | ( '"' { charValue } '"' )
charValue = hexEscape | octEscape | charEscape | unicodeEscape | unicodeLongEscape | /[^\0\n\\]/
hexEscape = '\' ( "x" | "X" ) hexDigit [ hexDigit ]
octEscape = '\' octalDigit [ octalDigit [ octalDigit ] ]
charEscape = '\' ( "a" | "b" | "f" | "n" | "r" | "t" | "v" | '\' | "'" | '"' )
unicodeEscape = '\' "u" hexDigit hexDigit hexDigit hexDigit
unicodeLongEscape = '\' "U" ( "000" hexDigit hexDigit hexDigit hexDigit hexDigit |
                              "0010" hexDigit hexDigit hexDigit hexDigit

EmptyStatement

emptyStatement = ";"

Constant

constant = fullIdent | ( [ "-" | "+" ] intLit ) | ( [ "-" | "+" ] floatLit ) |
                strLit | boolLit | MessageValue

MessageValue is defined in the Text Format Language Specification.

Edition

The edition statement replaces the legacy syntax keyword, and is used to define the edition that this file is using.

edition = "edition" "=" [ ( "'" decimalLit "'" ) | ( '"' decimalLit '"' ) ] ";"

Import Statement

The import statement is used to import another .proto’s definitions.

import = "import" [ "weak" | "public" ] strLit ";"

Example:

import public "other.proto";

Package

The package specifier can be used to prevent name clashes between protocol message types.

package = "package" fullIdent ";"

Example:

package foo.bar;

Option

Options can be used in proto files, messages, enums and services. An option can be a protobuf defined option or a custom option. For more information, see Options in the language guide. Options are also be used to control Feature Settings.

option = "option" optionName  "=" constant ";"
optionName = ( ident | "(" ["."] fullIdent ")" )

For examples:

option java_package = "com.example.foo";
option features.enum_type = CLOSED;

Fields

Fields are the basic elements of a protocol buffer message. Fields can be normal fields, group fields, oneof fields, or map fields. A field has a label, type and field number.

label = [ "repeated" ]
type = "double" | "float" | "int32" | "int64" | "uint32" | "uint64"
      | "sint32" | "sint64" | "fixed32" | "fixed64" | "sfixed32" | "sfixed64"
      | "bool" | "string" | "bytes" | messageType | enumType
fieldNumber = intLit;

Normal field

Each field has a label, type, name, and field number. It may have field options.

field = [label] type fieldName "=" fieldNumber [ "[" fieldOptions "]" ] ";"
fieldOptions = fieldOption { ","  fieldOption }
fieldOption = optionName "=" constant

Examples:

foo.bar nested_message = 2;
repeated int32 samples = 4 [packed=true];

Oneof and oneof field

A oneof consists of oneof fields and a oneof name. Oneof fields do not have labels.

oneof = "oneof" oneofName "{" { option | oneofField } "}"
oneofField = type fieldName "=" fieldNumber [ "[" fieldOptions "]" ] ";"

Example:

oneof foo {
    string name = 4;
    SubMessage sub_message = 9;
}

Map field

A map field has a key type, value type, name, and field number. The key type can be any integral or string type. Note, the key type may not be an enum.

mapField = "map" "<" keyType "," type ">" mapName "=" fieldNumber [ "[" fieldOptions "]" ] ";"
keyType = "int32" | "int64" | "uint32" | "uint64" | "sint32" | "sint64" |
          "fixed32" | "fixed64" | "sfixed32" | "sfixed64" | "bool" | "string"

Example:

map<string, Project> projects = 3;

Extensions and Reserved

Extensions and reserved are message elements that declare a range of field numbers or field names.

Extensions

Extensions declare that a range of field numbers in a message are available for third-party extensions. Other people can declare new fields for your message type with those numeric tags in their own .proto files without having to edit the original file.

extensions = "extensions" ranges ";"
ranges = range { "," range }
range =  intLit [ "to" ( intLit | "max" ) ]

Examples:

extensions 100 to 199;
extensions 4, 20 to max;

Reserved

Reserved declares a range of field numbers or names in a message or enum that can’t be used.

reserved = "reserved" ( ranges | reservedIdent ) ";"
fieldNames = fieldName { "," fieldName }

Examples:

reserved 2, 15, 9 to 11;
reserved foo, bar;

Top Level definitions

Enum definition

The enum definition consists of a name and an enum body. The enum body can have options, enum fields, and reserved statements.

enum = "enum" enumName enumBody
enumBody = "{" { option | enumField | emptyStatement | reserved } "}"
enumField = fieldName "=" [ "-" ] intLit [ "[" enumValueOption { ","  enumValueOption } "]" ]";"
enumValueOption = optionName "=" constant

Example:

enum EnumAllowingAlias {
  option allow_alias = true;
  EAA_UNSPECIFIED = 0;
  EAA_STARTED = 1;
  EAA_RUNNING = 2 [(custom_option) = "hello world"];
}

Message definition

A message consists of a message name and a message body. The message body can have fields, nested enum definitions, nested message definitions, extend statements, extensions, groups, options, oneofs, map fields, and reserved statements. A message cannot contain two fields with the same name in the same message schema.

message = "message" messageName messageBody
messageBody = "{" { field | enum | message | extend | extensions | group |
option | oneof | mapField | reserved | emptyStatement } "}"

Example:

message Outer {
  option (my_option).a = true;
  message Inner {   // Level 2
    required int64 ival = 1;
  }
  map<int32, string> my_map = 2;
  extensions 20 to 30;
}

None of the entities declared inside a message may have conflicting names. All of the following are prohibited:

message MyMessage {
  string foo = 1;
  message foo {}
}

message MyMessage {
  string foo = 1;
  oneof foo {
    string bar = 2;
  }
}

message MyMessage {
  string foo = 1;
  extend Extendable {
    string foo = 2;
  }
}

message MyMessage {
  string foo = 1;
  enum E {
    foo = 0;
  }
}

Extend

If a message in the same or imported .proto file has reserved a range for extensions, the message can be extended.

extend = "extend" messageType "{" {field | group} "}"

Example:

extend Foo {
  int32 bar = 126;
}

Service definition

service = "service" serviceName "{" { option | rpc | emptyStatement } "}"
rpc = "rpc" rpcName "(" [ "stream" ] messageType ")" "returns" "(" [ "stream" ]
messageType ")" (( "{" { option | emptyStatement } "}" ) | ";" )

Example:

service SearchService {
  rpc Search (SearchRequest) returns (SearchResponse);
}

Proto file

proto = [syntax] { import | package | option | topLevelDef | emptyStatement }
topLevelDef = message | enum | extend | service

An example .proto file:

edition = "2023";
import public "other.proto";
option java_package = "com.example.foo";
enum EnumAllowingAlias {
  option allow_alias = true;
  EAA_UNSPECIFIED = 0;
  EAA_STARTED = 1;
  EAA_RUNNING = 1;
  EAA_FINISHED = 2 [(custom_option) = "hello world"];
}
message Outer {
  option (my_option).a = true;
  message Inner {   // Level 2
    int64 ival = 1 [features.field_presence = LEGACY_REQUIRED];
  }
  repeated Inner inner_message = 2;
  EnumAllowingAlias enum_field = 3;
  map<int32, string> my_map = 4;
  extensions 20 to 30;
  reserved reserved_field;
}
message Foo {
  message GroupMessage {
    bool a = 1;
  }
  GroupMessage groupmessage = [features.message_encoding = DELIMITED];
}