Age | Commit message (Collapse) | Author |
|
Includes are now via upb/foo.h.
Files specific to the protobuf format are
now in upb/pb (the core library is concerned
with message definitions, handlers, and
byte streams, but knows nothing about any
particular serializationf format).
|
|
|
|
Next on the chopping block is upb_string.
|
|
I'm realizing that basically all upb objects
will need to be refcounted to be sharable
across languages, but *not* messages which
are on their way out so we can get out of
the business of data representations.
Things which must be refcounted:
- encoders, decoders
- handlers objects
- defs
|
|
Startseq/endseq handlers are called at the beginning
and end of a sequence of repeated values. Protobuf
does not really have direct support for this (repeated
primitive fields do not delimit "begin" and "end" of
the sequence) but we can infer them from the bytestream.
The benefit of supporting them explicitly is that they
get their own stack frame and closure, so we can avoid
having to find the array's address over and over and
deciding if we need to initialize it.
This will also pave the way for better support of JSON,
which does have explicit "startseq/endseq" markers: [].
|
|
Now the dispatcher will call error handlers
instaed of returning statuses that the caller
has to constantly check.
|
|
|
|
|
|
|
|
|
|
This doesn't reflect any material change in
how I will be working on upb, and I have no
problem making this change. It's still open
source under the BSD license, and I'll still
be working on it well beyond the hours that
constitute a normal job.
|
|
This is a significant change to the upb_stream
protocol, and should hopefully be the last
significant change.
All callbacks are now registered ahead-of-time
instead of having delegated callbacks registered
at runtime, which makes it much easier to
aggressively optimize ahead-of-time (like with a
JIT).
Other impacts of this change:
- You no longer need to have loaded descriptor.proto
as a upb_def to load other descriptors! This means
the special-case code we used for bootstrapping is
no longer necessary, and we no longer need to link
the descriptor for descriptor.proto into upb.
- A client can now register any upb_value as what
will be delivered to their value callback, not
just a upb_fielddef*. This should allow for other
clients to get more bang out of the streaming
decoder.
This change unfortunately causes a bit of a performance
regression -- I think largely due to highly
suboptimal code that GCC generates when structs
are returned by value. See:
http://blog.reverberate.org/2011/03/19/when-a-compilers-slow-code-actually-bites-you/
On the other hand, once we have a JIT this should
no longer matter.
Performance numbers:
plain.parsestream_googlemessage1.upb_table: 374 -> 396 (5.88)
plain.parsestream_googlemessage2.upb_table: 616 -> 449 (-27.11)
plain.parsetostruct_googlemessage1.upb_table_byref: 268 -> 269 (0.37)
plain.parsetostruct_googlemessage1.upb_table_byval: 215 -> 204 (-5.12)
plain.parsetostruct_googlemessage2.upb_table_byref: 307 -> 281 (-8.47)
plain.parsetostruct_googlemessage2.upb_table_byval: 297 -> 272 (-8.42)
omitfp.parsestream_googlemessage1.upb_table: 423 -> 410 (-3.07)
omitfp.parsestream_googlemessage2.upb_table: 679 -> 483 (-28.87)
omitfp.parsetostruct_googlemessage1.upb_table_byref: 287 -> 282 (-1.74)
omitfp.parsetostruct_googlemessage1.upb_table_byval: 226 -> 219 (-3.10)
omitfp.parsetostruct_googlemessage2.upb_table_byref: 315 -> 298 (-5.40)
omitfp.parsetostruct_googlemessage2.upb_table_byval: 297 -> 287 (-3.37)
|
|
|
|
|
|
If a Lua program references the default message, it will
be copied into a mutable object.
|
|
Default values are now supported, and the Lua extension
can now create and modify individual protobuf objects.
|
|
It is slower than the C decoder for now because it
falls off the fast path too often. But it can
successfully decode varints, fixed32 and fixed64.
|
|
|
|
|
|
There is currently a memory leak when type definitions
form cycles. This will need to be dealt with.
|
|
The context owns a reference on each def, defs own references
on defs they reference, and msgs own refs on their def.
|
|
|
|
of .h file.
|
|
Also delay deletion of subfields until the entire message is
deleted.
|
|
|
|
|
|
|
|
|
|
The cost is that a upb_msg will now always have an overhead
of 2*sizeof(void*). This is comparable to proto2 overhead.
The benefit is that upb_msg is now self-describing, and
read-only algorithms can now operate on a upb_msg regardless
of the memory-management scheme.
Also, upb_array and upb_string now know inherently if they
own their associated memory, and upb_array has a generic
pointer for memory management purposes like upb_msg does.
|
|
There is significant refactoring here, as well as some more trivial
name changes. upb_msg has become upb_msgdef, to reflect the fact
that a upb_msg is not *itself* a message, it describes a message.
There are other renamings, such as upb_parse_state -> upb_stream_parser.
More significantly, the upb_msg class and parser have been refactored
to reflect my recent realization about how memory management should
work. upb_msg now has no memory management, and a memory mangement
scheme (that works beautifully with multiple language runtimes) will
be layered on top of it.
This iteration has the new, read-only upb_msg. upb_mm_msg (a
memory-managed message class) will come in the next change.
|
|
|
|
|
|
|
|
was a lot of work.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|