Age | Commit message | Author |
|
Now the dispatcher calls error handlers
instead of returning statuses that the caller
has to constantly check.
|
|
It can successfully parse SpeedMessage1.
Preliminary results: 750MB/s on Core2 2.4GHz.
This is 2.5x the speed of proto2.
This isn't apples-to-apples, because
proto2 is parsing to a struct and we are
just doing stream parsing, but for apps
that are currently using proto2, this is the
improvement they would see if they could
move to stream-based processing.
Unfortunately perf-regression-test.py is
broken, and I'm not 100% sure why. It would
be nice to fix it first (to ensure that
there are no performance regressions for
the table-based decoder) but I'm really
impatient to get the JIT checked in.
|
|
Simplified some of the semantics around
the decoder's data structures, in anticipation
of sharing them between the regular C decoder
and a JIT-ted decoder.
|
|
This allows us to remove one type check in the
critical path.
|
|
This doesn't reflect any material change in
how I will be working on upb, and I have no
problem making this change. It's still open
source under the BSD license, and I'll still
be working on it well beyond the hours that
constitute a normal job.
|
|
This is a significant change to the upb_stream
protocol, and should hopefully be the last
significant change.
All callbacks are now registered ahead-of-time
instead of having delegated callbacks registered
at runtime, which makes it much easier to
aggressively optimize ahead-of-time (like with a
JIT).
Other impacts of this change:
- You no longer need to have loaded descriptor.proto
as a upb_def to load other descriptors! This means
the special-case code we used for bootstrapping is
no longer necessary, and we no longer need to link
the descriptor for descriptor.proto into upb.
- A client can now register any upb_value as what
will be delivered to their value callback, not
just a upb_fielddef*. This should allow for other
clients to get more bang out of the streaming
decoder.
This change unfortunately causes a bit of a performance
regression -- I think largely due to highly
suboptimal code that GCC generates when structs
are returned by value. See:
http://blog.reverberate.org/2011/03/19/when-a-compilers-slow-code-actually-bites-you/
On the other hand, once we have a JIT this should
no longer matter.
Performance numbers:
plain.parsestream_googlemessage1.upb_table: 374 -> 396 (5.88)
plain.parsestream_googlemessage2.upb_table: 616 -> 449 (-27.11)
plain.parsetostruct_googlemessage1.upb_table_byref: 268 -> 269 (0.37)
plain.parsetostruct_googlemessage1.upb_table_byval: 215 -> 204 (-5.12)
plain.parsetostruct_googlemessage2.upb_table_byref: 307 -> 281 (-8.47)
plain.parsetostruct_googlemessage2.upb_table_byval: 297 -> 272 (-8.42)
omitfp.parsestream_googlemessage1.upb_table: 423 -> 410 (-3.07)
omitfp.parsestream_googlemessage2.upb_table: 679 -> 483 (-28.87)
omitfp.parsetostruct_googlemessage1.upb_table_byref: 287 -> 282 (-1.74)
omitfp.parsetostruct_googlemessage1.upb_table_byval: 226 -> 219 (-3.10)
omitfp.parsetostruct_googlemessage2.upb_table_byref: 315 -> 298 (-5.40)
omitfp.parsetostruct_googlemessage2.upb_table_byval: 297 -> 287 (-3.37)
|
|
Default values are now supported, and the Lua extension
can now create and modify individual protobuf objects.
|
|
The symtab that contains them is now hidden, and
you can look them up by name but there is no access
to the symtab itself, so there is no risk of
mutating it (by extending it, adding other defs
to it, etc).
|
|
It is slower than the C decoder for now because it
falls off the fast path too often. But it can
successfully decode varints, fixed32 and fixed64.
|
|
upb_inttable() now supports a "compact" operation that will
decide on an array size and put all entries with small enough
keys into the array part for faster lookup.
Also exposed the upb_itof_ent structure and put a few useful
values there, so they are one fewer pointer chase away.
|
|
The compiler wasn't keeping upb_dstate in memory
anyway (which was the original goal). This
simplifies the decoder. upb_decode_fixed
was intended to minimize the number of branches,
but since it was calling out to memcpy as a
function, this turned out to be a pessimization.
Performance is encouraging:
plain32.parsestream_googlemessage1.upb_table: 254 -> 242 (-4.72)
plain32.parsestream_googlemessage2.upb_table: 357 -> 400 (12.04)
plain32.parsetostruct_googlemessage1.upb_table_byref: 143 -> 144 (0.70)
plain32.parsetostruct_googlemessage1.upb_table_byval: 122 -> 118 (-3.28)
plain32.parsetostruct_googlemessage2.upb_table_byref: 189 -> 200 (5.82)
plain32.parsetostruct_googlemessage2.upb_table_byval: 198 -> 200 (1.01)
omitfp32.parsestream_googlemessage1.upb_table: 267 -> 265 (-0.75)
omitfp32.parsestream_googlemessage2.upb_table: 377 -> 465 (23.34)
omitfp32.parsetostruct_googlemessage1.upb_table_byref: 140 -> 151 (7.86)
omitfp32.parsetostruct_googlemessage1.upb_table_byval: 131 -> 131 (0.00)
omitfp32.parsetostruct_googlemessage2.upb_table_byref: 204 -> 214 (4.90)
omitfp32.parsetostruct_googlemessage2.upb_table_byval: 200 -> 206 (3.00)
plain.parsestream_googlemessage1.upb_table: 313 -> 317 (1.28)
plain.parsestream_googlemessage2.upb_table: 476 -> 541 (13.66)
plain.parsetostruct_googlemessage1.upb_table_byref: 189 -> 189 (0.00)
plain.parsetostruct_googlemessage1.upb_table_byval: 165 -> 165 (0.00)
plain.parsetostruct_googlemessage2.upb_table_byref: 263 -> 270 (2.66)
plain.parsetostruct_googlemessage2.upb_table_byval: 248 -> 255 (2.82)
omitfp.parsestream_googlemessage1.upb_table: 306 -> 305 (-0.33)
omitfp.parsestream_googlemessage2.upb_table: 471 -> 531 (12.74)
omitfp.parsetostruct_googlemessage1.upb_table_byref: 189 -> 190 (0.53)
omitfp.parsetostruct_googlemessage1.upb_table_byval: 166 -> 172 (3.61)
omitfp.parsetostruct_googlemessage2.upb_table_byref: 258 -> 270 (4.65)
omitfp.parsetostruct_googlemessage2.upb_table_byval: 248 -> 265 (6.85)