Age | Commit message | Author |
|
This is a significant change to the upb_stream
protocol, and should hopefully be the last
significant change.
All callbacks are now registered ahead of time
instead of having delegated callbacks registered
at runtime, which makes it much easier to
optimize aggressively before parsing begins (as
a JIT would).
Other impacts of this change:
- You no longer need to have loaded descriptor.proto
as a upb_def to load other descriptors! This means
the special-case code we used for bootstrapping is
no longer necessary, and we no longer need to link
the descriptor for descriptor.proto into upb.
- A client can now register any upb_value as what
will be delivered to their value callback, not
just a upb_fielddef*. This should allow for other
clients to get more bang out of the streaming
decoder.
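As a rough illustration of the two points above, handlers registered
before parsing and an arbitrary value bound to each value callback, here
is a minimal sketch. Every name in it (sketch_value, sketch_handlers,
sketch_register, sketch_dispatch, sketch_store_u64) is hypothetical and
is not the upb API; it only shows the general shape.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for upb_value: an untagged union. */
typedef union {
  uint64_t u64;
  void *ptr;
} sketch_value;

/* Value callback: gets the closure, the value bound at registration
 * time, and the decoded field value. */
typedef void sketch_value_cb(void *closure, sketch_value bound, uint64_t val);

typedef struct {
  sketch_value_cb *cb;
  sketch_value bound;
} sketch_field_handler;

#define SKETCH_MAX_FIELDS 16

/* All handlers live in one table that is filled in before parsing. */
typedef struct {
  sketch_field_handler fields[SKETCH_MAX_FIELDS];
} sketch_handlers;

static void sketch_register(sketch_handlers *h, uint32_t fieldnum,
                            sketch_value_cb *cb, sketch_value bound) {
  assert(fieldnum < SKETCH_MAX_FIELDS);
  h->fields[fieldnum].cb = cb;
  h->fields[fieldnum].bound = bound;
}

/* Dispatch one decoded value. Nothing is registered at parse time.
 * Returns 1 if a handler ran, 0 if the field has no handler. */
static int sketch_dispatch(const sketch_handlers *h, void *closure,
                           uint32_t fieldnum, uint64_t val) {
  if (fieldnum >= SKETCH_MAX_FIELDS || h->fields[fieldnum].cb == NULL) {
    return 0;
  }
  h->fields[fieldnum].cb(closure, h->fields[fieldnum].bound, val);
  return 1;
}

/* Example callback: the bound value is a byte offset into the closure
 * struct, so one callback can serve several uint64_t fields. */
static void sketch_store_u64(void *closure, sketch_value bound, uint64_t val) {
  uint64_t *slot = (uint64_t *)((char *)closure + bound.u64);
  *slot = val;
}
```

Because the whole table is known before the first byte is parsed, a
code generator could in principle specialize on it, which is the
motivation given above.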
This change unfortunately causes a bit of a performance
regression -- I think largely due to highly
suboptimal code that GCC generates when structs
are returned by value. See:
http://blog.reverberate.org/2011/03/19/when-a-compilers-slow-code-actually-bites-you/
On the other hand, once we have a JIT this should
no longer matter.
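For readers unfamiliar with the distinction, the two function shapes
being contrasted look roughly like this. The names and the one-byte
"decode" are hypothetical placeholders, not upb code, and which shape
compiles better depends on the compiler and ABI.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
  const char *ptr;  /* position just past the value that was read */
  uint64_t val;     /* the value that was read */
  int status;       /* 0 on success */
} sketch_result;

/* Shape 1: the whole result is a struct returned by value. */
static sketch_result sketch_read_byval(const char *p) {
  sketch_result r;
  r.val = (unsigned char)p[0];
  r.ptr = p + 1;
  r.status = 0;
  return r;
}

/* Shape 2: only the status is returned; the rest goes through
 * out parameters. */
static int sketch_read_byref(const char *p, const char **end, uint64_t *val) {
  *val = (unsigned char)p[0];
  *end = p + 1;
  return 0;
}
```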
Performance numbers (before -> after, % change):
plain.parsestream_googlemessage1.upb_table: 374 -> 396 (5.88)
plain.parsestream_googlemessage2.upb_table: 616 -> 449 (-27.11)
plain.parsetostruct_googlemessage1.upb_table_byref: 268 -> 269 (0.37)
plain.parsetostruct_googlemessage1.upb_table_byval: 215 -> 204 (-5.12)
plain.parsetostruct_googlemessage2.upb_table_byref: 307 -> 281 (-8.47)
plain.parsetostruct_googlemessage2.upb_table_byval: 297 -> 272 (-8.42)
omitfp.parsestream_googlemessage1.upb_table: 423 -> 410 (-3.07)
omitfp.parsestream_googlemessage2.upb_table: 679 -> 483 (-28.87)
omitfp.parsetostruct_googlemessage1.upb_table_byref: 287 -> 282 (-1.74)
omitfp.parsetostruct_googlemessage1.upb_table_byval: 226 -> 219 (-3.10)
omitfp.parsetostruct_googlemessage2.upb_table_byref: 315 -> 298 (-5.40)
omitfp.parsetostruct_googlemessage2.upb_table_byval: 297 -> 287 (-3.37)
|
It is slower than the C decoder for now because it
falls off the fast path too often. But it can
successfully decode varints, fixed32 and fixed64.
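For reference, a varint in the protobuf wire format stores 7 payload
bits per byte, least-significant group first, with the high bit of each
byte marking continuation. A minimal, unoptimized sketch of decoding
one follows; the function name is hypothetical and this is not the
decoder described above.

```c
#include <stddef.h>
#include <stdint.h>

/* Decodes one varint from buf[0..len). Returns the number of bytes
 * consumed, or 0 if the input is truncated or longer than 10 bytes. */
static size_t sketch_decode_varint(const uint8_t *buf, size_t len,
                                   uint64_t *out) {
  uint64_t result = 0;
  size_t i;
  for (i = 0; i < len && i < 10; i++) {
    /* Low 7 bits are payload, placed at bit position 7*i. */
    result |= (uint64_t)(buf[i] & 0x7f) << (7 * i);
    if ((buf[i] & 0x80) == 0) {
      /* High bit clear: this was the last byte. */
      *out = result;
      return i + 1;
    }
  }
  return 0;
}
```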
|
The compiler wasn't keeping upb_dstate in registers
anyway (which was the original goal). This
simplifies the decoder. upb_decode_fixed
was intended to minimize the number of branches,
but since it was calling out to memcpy as a
function, this turned out to be a pessimization.
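One plausible reading of the memcpy point, sketched with hypothetical
names (this is not the actual upb_decode_fixed): when the copy size is
only known at run time, memcpy may remain a real function call, whereas
a constant size usually lets the compiler turn it into a plain load.
Both functions assume the input has at least the requested number of
bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Generic shape: one function for both widths, no branch on the type,
 * but the size is a run-time value. Assumes size <= 8; for size < 8
 * the result is only meaningful on a little-endian host. */
static uint64_t sketch_decode_fixed(const char *p, size_t size) {
  uint64_t val = 0;
  memcpy(&val, p, size);
  return val;
}

/* Specialized shapes: the size is a compile-time constant. The value
 * is read in host byte order. */
static uint32_t sketch_decode_fixed32(const char *p) {
  uint32_t val;
  memcpy(&val, p, sizeof(val));
  return val;
}

static uint64_t sketch_decode_fixed64(const char *p) {
  uint64_t val;
  memcpy(&val, p, sizeof(val));
  return val;
}
```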
Performance is encouraging (before -> after, % change):
plain32.parsestream_googlemessage1.upb_table: 254 -> 242 (-4.72)
plain32.parsestream_googlemessage2.upb_table: 357 -> 400 (12.04)
plain32.parsetostruct_googlemessage1.upb_table_byref: 143 -> 144 (0.70)
plain32.parsetostruct_googlemessage1.upb_table_byval: 122 -> 118 (-3.28)
plain32.parsetostruct_googlemessage2.upb_table_byref: 189 -> 200 (5.82)
plain32.parsetostruct_googlemessage2.upb_table_byval: 198 -> 200 (1.01)
omitfp32.parsestream_googlemessage1.upb_table: 267 -> 265 (-0.75)
omitfp32.parsestream_googlemessage2.upb_table: 377 -> 465 (23.34)
omitfp32.parsetostruct_googlemessage1.upb_table_byref: 140 -> 151 (7.86)
omitfp32.parsetostruct_googlemessage1.upb_table_byval: 131 -> 131 (0.00)
omitfp32.parsetostruct_googlemessage2.upb_table_byref: 204 -> 214 (4.90)
omitfp32.parsetostruct_googlemessage2.upb_table_byval: 200 -> 206 (3.00)
plain.parsestream_googlemessage1.upb_table: 313 -> 317 (1.28)
plain.parsestream_googlemessage2.upb_table: 476 -> 541 (13.66)
plain.parsetostruct_googlemessage1.upb_table_byref: 189 -> 189 (0.00)
plain.parsetostruct_googlemessage1.upb_table_byval: 165 -> 165 (0.00)
plain.parsetostruct_googlemessage2.upb_table_byref: 263 -> 270 (2.66)
plain.parsetostruct_googlemessage2.upb_table_byval: 248 -> 255 (2.82)
omitfp.parsestream_googlemessage1.upb_table: 306 -> 305 (-0.33)
omitfp.parsestream_googlemessage2.upb_table: 471 -> 531 (12.74)
omitfp.parsetostruct_googlemessage1.upb_table_byref: 189 -> 190 (0.53)
omitfp.parsetostruct_googlemessage1.upb_table_byval: 166 -> 172 (3.61)
omitfp.parsetostruct_googlemessage2.upb_table_byref: 258 -> 270 (4.65)
omitfp.parsetostruct_googlemessage2.upb_table_byval: 248 -> 265 (6.85)