Apache Avro is a serialization framework and a remote procedure call mechanism that can generate serialization code for messages described in a platform-independent message description language. Bindings are available for multiple languages, including C++, Java, Python, and JavaScript.
The message schema can be declared in JSON or in an interface description language (IDL) that maps to JSON. The schema supports a set of common primitive and constructed types, with annotations for additional information. The framework treats the writer schema and the reader schema as separate schemas, and matches their types and fields by name while applying a set of conversion rules.
@namespace ("example")
protocol ExampleProtocol {

    // A fixed type is an opaque byte array of a fixed length.
    fixed LongLongInteger (8);

    // Enum values are encoded as integers and an enum can have a default value.
    enum SomeEnum { ONE, TWO, THREE } = THREE;

    record SomeRecord {

        // This is really only useful in unions.
        null aNullField;

        // The usual primitive types.
        boolean aBoolean;
        int aSigned32BitInteger;
        long aSigned64BitInteger;
        float aFloat = 0.0;
        double aDouble;
        bytes someBytes;

        /** This is in fact a doc string. */
        string
            @aliases (["anOriginalString", "anEvenMoreOriginalString"])
            aString = "This used to have some other names but not now.";

        // Logical types are supported on top of some standard types.
        // (unsignedLong is not a standard logical type; a reader that does not
        // know a logical type falls back to the underlying type.)
        @logicalType ("unsignedLong")
        long anUnsigned64BitInteger;
        @logicalType ("decimal")
        @precision (8)
        @scale (2)
        bytes aFixedDecimalPointNumber;
        @logicalType ("uuid")
        string anUUID;

        // Nested records with optional presence.
        SomeRecord? nextRecord;

        // The usual complex types.
        array<string> aStringList = [];
        map<int> aStringToIntMap = { "key" : 0xBAAD };
        union { null, int } anotherOptionalDeclaration = null;

        @deprecated (true)
        int aDeprecatedInteger = 0;

        LongLongInteger aLongLongInteger;
        SomeEnum anEnumValue;
    }

    // Guess how this would get serialized :-)
    record TestRecord { TostRecord r; }
    record TostRecord { TestRecord r; }
}
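To illustrate how IDL declarations map to JSON, the following minimal sketch writes a few of the fields above by hand as a JSON schema, using only the Python standard library. The record name SmallRecord and the shortened default string are hypothetical; the output of the actual IDL compiler may differ in ordering and detail.

```python
import json

# Hand-written JSON form of a few declarations from the IDL listing.
# "SmallRecord" is a hypothetical name chosen for this illustration.
schema = {
    "namespace": "example",
    "type": "record",
    "name": "SmallRecord",
    "fields": [
        # float aFloat = 0.0;
        {"name": "aFloat", "type": "float", "default": 0.0},
        # string @aliases ([...]) aString = "...";
        {"name": "aString", "type": "string",
         "aliases": ["anOriginalString", "anEvenMoreOriginalString"],
         "default": "shortened default"},
        # array<string> aStringList = [];
        {"name": "aStringList",
         "type": {"type": "array", "items": "string"}, "default": []},
        # union { null, int } anotherOptionalDeclaration = null;
        {"name": "anotherOptionalDeclaration",
         "type": ["null", "int"], "default": None},
        # fixed LongLongInteger (8);
        {"name": "aLongLongInteger",
         "type": {"type": "fixed", "name": "LongLongInteger", "size": 8}},
        # enum SomeEnum { ONE, TWO, THREE } = THREE;
        {"name": "anEnumValue",
         "type": {"type": "enum", "name": "SomeEnum",
                  "symbols": ["ONE", "TWO", "THREE"], "default": "THREE"}},
    ],
}

# Serialize to the JSON text that a schema file would contain.
schema_json = json.dumps(schema, indent=2)
print(schema_json)
```

Note that an optional field is simply a union with null, which is why the null type is mostly useful inside unions.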
Every field is either required or declares a default value
The usual primitive and constructed types
Namespaces for named types; arrays, maps and unions are anonymous
The primary schema format is JSON
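The way a reader schema is reconciled with data written under a different writer schema can be sketched in plain Python. This is a simplified illustration and not the Avro library: the function resolve_record and the field names in the usage are hypothetical, fields are plain dictionaries, and only primitive types with numeric promotions are handled (the specification permits more conversions).

```python
# Numeric promotions from writer type to reader type (a subset of the rules).
NUMERIC_PROMOTIONS = {
    "int": {"long", "float", "double"},
    "long": {"float", "double"},
    "float": {"double"},
}

def resolve_record(datum, writer_fields, reader_fields):
    """Project a datum decoded with the writer schema onto the reader schema."""
    writer_by_name = {field["name"]: field for field in writer_fields}
    resolved = {}
    for reader_field in reader_fields:
        # A reader field matches a writer field by its name or by an alias.
        names = [reader_field["name"], *reader_field.get("aliases", [])]
        writer_field = next(
            (writer_by_name[name] for name in names if name in writer_by_name),
            None)
        if writer_field is None:
            # The writer did not know the field: the default is mandatory.
            if "default" not in reader_field:
                raise ValueError(
                    f"no value and no default for {reader_field['name']!r}")
            resolved[reader_field["name"]] = reader_field["default"]
            continue
        written, wanted = writer_field["type"], reader_field["type"]
        if written != wanted and wanted not in NUMERIC_PROMOTIONS.get(written, ()):
            raise TypeError(f"cannot read {written} as {wanted}")
        value = datum[writer_field["name"]]
        if wanted in ("float", "double"):
            value = float(value)
        resolved[reader_field["name"]] = value
    # Writer fields that the reader does not declare are silently dropped.
    return resolved

writer_fields = [
    {"name": "anOriginalString", "type": "string"},
    {"name": "aCounter", "type": "int"},
    {"name": "dropped", "type": "boolean"},
]
reader_fields = [
    {"name": "aString", "type": "string", "aliases": ["anOriginalString"]},
    {"name": "aCounter", "type": "long"},
    {"name": "aDeprecatedInteger", "type": "int", "default": 0},
]
datum = {"anOriginalString": "hello", "aCounter": 7, "dropped": True}
resolved = resolve_record(datum, writer_fields, reader_fields)
print(resolved)
# {'aString': 'hello', 'aCounter': 7, 'aDeprecatedInteger': 0}
```

The sketch shows why a field added to a schema later needs a default value: without one, data written under the older schema cannot be read.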
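In the binary encoding, values of each type are written as follows.
null: empty (nothing written)
boolean: single byte, 0 (false) or 1 (true)
int, long: variable-length zig-zag encoding, which keeps values of small magnitude short regardless of sign
float, double: standard IEEE 754 encoding (4 and 8 bytes, little-endian)
string: UTF-8 encoded characters prefixed with their length in bytes
record: sequence of field values in schema order
union: index of the type in schema order, followed by the value
array: sequence of blocks of array items, each prefixed with its item count, terminated by a block with count 0
map: sequence of blocks of key-value pairs, each prefixed with its pair count, terminated by a block with count 0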
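The encodings listed above can be sketched with a few small Python functions. This is an illustrative sketch using only the standard library, not the Avro library itself; all function names are hypothetical, integers are assumed to fit in 64 bits, and arrays and maps are always written as a single block.

```python
import struct

def zigzag(n: int) -> int:
    """Map signed to unsigned so that 0, -1, 1, -2 become 0, 1, 2, 3."""
    return (n << 1) ^ (n >> 63)

def encode_long(n: int) -> bytes:
    """Zig-zag, then emit 7 bits per byte, lowest bits first.
    The high bit of each byte signals that more bytes follow."""
    z = zigzag(n)
    out = bytearray()
    while z > 0x7F:
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)

def encode_null() -> bytes:
    return b""                                  # nothing is written

def encode_boolean(value: bool) -> bytes:
    return b"\x01" if value else b"\x00"

def encode_double(value: float) -> bytes:
    return struct.pack("<d", value)             # IEEE 754, little-endian

def encode_string(value: str) -> bytes:
    data = value.encode("utf-8")
    return encode_long(len(data)) + data        # byte length, then bytes

def encode_union(index: int, payload: bytes) -> bytes:
    return encode_long(index) + payload         # branch index, then value

def encode_array(items, encode_item) -> bytes:
    out = b""
    if items:                                   # one block with all items
        out += encode_long(len(items))
        out += b"".join(encode_item(item) for item in items)
    return out + encode_long(0)                 # terminating empty block

def encode_map(pairs: dict, encode_value) -> bytes:
    out = b""
    if pairs:                                   # keys are always strings
        out += encode_long(len(pairs))
        for key, value in pairs.items():
            out += encode_string(key) + encode_value(value)
    return out + encode_long(0)

# A record is just its field values concatenated in schema order.
# Here: a string, a union { null, int } holding an int, an array of longs.
record = (encode_string("foo")
          + encode_union(1, encode_long(-1))
          + encode_array([3, 27], encode_long))
print(record.hex(" "))
# 06 66 6f 6f 02 01 04 06 36 00
```

The record carries no field names or type tags, so the bytes can only be decoded with knowledge of the writer schema.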
The Apache Avro Project Home Page. https://avro.apache.org