Age | Commit message (Collapse) | Author |
|
Many improvements, too many to mention. One significant
perf regression warrants investigation:
omitfp.parsetoproto2_googlemessage1.upb_jit: 343 -> 252 (-26.53%)
plain.parsetoproto2_googlemessage1.upb_jit: 334 -> 251 (-24.85%)
A 25% regression on this benchmark is bad, but since I don't think
there's any fundamental design issue causing it, I'm going to
go ahead with the commit anyway and investigate and fix it later.
Other benchmarks were neutral or showed slight improvement.
|
|
This breaks the open-source build; I will
follow up with a change to fix it.
|
|
Added a upb_byteregion that tracks a region of
the input buffer; decoders use this instead of
using a upb_bytesrc directly. upb_byteregion
is also used as the way of passing a string to
a upb_handlers callback. This symmetry makes
decoders compose better; if you want to take
a parsed string and decode it as something else,
you can take the string directly from the callback
and feed it as input to another parser.
A commented-out version of a pinning interface
is present; I decline to actually implement it
(and accept its extra complexity) until/unless
it is clear that it is actually a win. But it
is included as a proof-of-concept, to show that
it fits well with the existing interface.
|
|
This might actually just bring to light my misuse of the upb_fielddef
functions. The test assertions are fine, but an assertion in upb/upb.h
fails:
./upb/upb.h:181: upb_value_getptr: Assertion `val.type == 33' failed.
|
|
This type was nothing but a map of defs.
We can just as easily pass an array of defs
into upb_symtab_add().
|
|
Includes are now via upb/foo.h.
Files specific to the protobuf format are
now in upb/pb (the core library is concerned
with message definitions, handlers, and
byte streams, but knows nothing about any
particular serialization format).
|
|
Next on the chopping block is upb_string.
|
|
I'm realizing that basically all upb objects
will need to be refcounted to be sharable
across languages. The exception is messages,
which are on their way out, so we can get out
of the business of data representations.
Things that must be refcounted:
- encoders, decoders
- handlers objects
- defs
|
|
Startseq/endseq handlers are called at the beginning
and end of a sequence of repeated values. Protobuf
does not really have direct support for this (repeated
primitive fields do not delimit the "begin" and "end" of
the sequence), but we can infer them from the bytestream.
The benefit of supporting them explicitly is that they
get their own stack frame and closure, so we avoid
having to find the array's address over and over and
decide whether it needs to be initialized.
This will also pave the way for better support of JSON,
which does have explicit startseq/endseq markers: [ and ].
|
|
It can successfully parse SpeedMessage1.
Preliminary results: 750MB/s on Core2 2.4GHz.
This is 2.5x proto2's speed.
This isn't apples-to-apples, because
proto2 is parsing to a struct and we are
just doing stream parsing, but for apps
that are currently using proto2, this is the
improvement they would see if they could
move to stream-based processing.
Unfortunately perf-regression-test.py is
broken, and I'm not 100% sure why. It would
be nice to fix it first (to ensure that
there are no performance regressions for
the table-based decoder) but I'm really
impatient to get the JIT checked in.
|
|
Default values are now supported, and the Lua extension
can now create and modify individual protobuf objects.
|
|
The test currently triggers valgrind-detected memory errors.
|
|
There is currently a memory leak when type definitions
form cycles. This will need to be dealt with.
|
|
This should make it both easier to use and easier to
optimize, in exchange for a small amount of generality.
In practice, any remotely normal case is still very
natural.
|
|
of .h file.
|
|
Also delay deletion of subfields until the entire message is
deleted.
|
|
(want them inlined).
|
|