upb.git - a small protobuf implementation in C

Age	Commit message	Author
2011-07-15	Directory restructure.	Joshua Haberman
	Includes are now via upb/foo.h. Files specific to the protobuf format are now in upb/pb (the core library is concerned with message definitions, handlers, and byte streams, but knows nothing about any particular serialization format).
2011-07-14	Major refactoring: upb_string is gone in favor of upb_strref.	Joshua Haberman

2011-06-17	Major refactoring: abandon upb_msg, add upb_accessors.	Joshua Haberman
	Next on the chopping block is upb_string.
2011-05-21	Make all handlers objects refcounted.	Joshua Haberman
	I'm realizing that basically all upb objects will need to be refcounted to be sharable across languages, but not messages which are on their way out so we can get out of the business of data representations. Things which must be refcounted:
	- encoders, decoders
	- handlers objects
	- defs
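The refcounting idea in the commit above can be sketched roughly as follows. This is a minimal illustration, not upb's actual code: every name (`rc_base`, `rc_ref`, `rc_unref`, `rc_handlers`) is hypothetical, and the sketch assumes single-threaded use.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted base; this is NOT upb's actual layout. */
typedef struct rc_base {
  int refcount;
  void (*destroy)(struct rc_base *self);  /* type-specific cleanup */
} rc_base;

/* Takes an additional reference on a live object. */
static void rc_ref(rc_base *obj) {
  assert(obj->refcount > 0);
  obj->refcount++;
}

/* Drops one reference; destroys the object when the last owner lets go. */
static void rc_unref(rc_base *obj) {
  assert(obj->refcount > 0);
  if (--obj->refcount == 0) obj->destroy(obj);
}

/* Example object: a handlers set that embeds the base as its first member. */
typedef struct {
  rc_base base;
  int handler_count;
} rc_handlers;

static int rc_live_handlers = 0;  /* lets the example observe destruction */

static void rc_handlers_destroy(rc_base *self) {
  rc_live_handlers--;
  free(self);  /* valid because base is the first member of rc_handlers */
}

static rc_handlers *rc_handlers_new(void) {
  rc_handlers *h = malloc(sizeof(*h));
  if (!h) return NULL;
  h->base.refcount = 1;  /* the creator owns the first reference */
  h->base.destroy = rc_handlers_destroy;
  h->handler_count = 0;
  rc_live_handlers++;
  return h;
}
```

A language binding would call `rc_ref` when it wraps the object and `rc_unref` from its finalizer, so the C object lives until every runtime has released it.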
2011-05-19	Change dispatcher error handling model.	Joshua Haberman
	Now the dispatcher will call error handlers instead of returning statuses that the caller has to constantly check.
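A rough sketch of this error-handling model, under stated assumptions: the names (`eh_dispatcher`, `eh_fail`, `eh_dispatch_value`) are hypothetical, and the "sticky failed flag" that turns later calls into no-ops is an assumption of the sketch, not something the commit message states.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical error callback type: receives the client's closure and a message. */
typedef void eh_error_cb(void *closure, const char *msg);

typedef struct {
  eh_error_cb *on_error;  /* registered once, up front */
  void *closure;
  int failed;             /* assumption: sticky flag, later calls become no-ops */
  int delivered;          /* number of values successfully dispatched */
} eh_dispatcher;

/* Reports the first error to the handler instead of returning a status. */
static void eh_fail(eh_dispatcher *d, const char *msg) {
  if (!d->failed && d->on_error) d->on_error(d->closure, msg);
  d->failed = 1;
}

/* Returns void: the caller has no status to check after each call. */
static void eh_dispatch_value(eh_dispatcher *d, int field_number) {
  if (d->failed) return;
  if (field_number <= 0) {
    eh_fail(d, "invalid field number");
    return;
  }
  d->delivered++;  /* stand-in for delivering the value to a value handler */
}

/* Example handler: records how many errors were reported and the last message. */
typedef struct {
  int count;
  const char *last;
} eh_error_log;

static void eh_record_error(void *closure, const char *msg) {
  eh_error_log *log = closure;
  assert(log != NULL);
  log->count++;
  log->last = msg;
}
```

The calling code can issue a run of dispatch calls back to back and inspect the error log (or the `failed` flag) once at the end.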
2011-05-08	Decoder redesign in preparation for packed fields and start/endseq.	Joshua Haberman

2011-04-04	Speed up parsetostruct by using type-specialized callbacks.	Joshua Haberman

2011-04-01	First rough version of the JIT.	Joshua Haberman
	It can successfully parse SpeedMessage1. Preliminary results: 750MB/s on Core2 2.4GHz. This number is 2.5x proto2. This isn't apples-to-apples, because proto2 is parsing to a struct and we are just doing stream parsing, but for apps that are currently using proto2, this is the improvement they would see if they could move to stream-based processing. Unfortunately perf-regression-test.py is broken, and I'm not 100% sure why. It would be nice to fix it first (to ensure that there are no performance regressions for the table-based decoder) but I'm really impatient to get the JIT checked in.
2011-03-20	Update copyright to be Google Inc.	Josh Haberman
	This doesn't reflect any material change in how I will be working on upb, and I have no problem making this change. It's still open source under the BSD license, and I'll still be working on it well beyond the hours that constitute a normal job.
2011-03-20	upb_stream: all callbacks registered ahead-of-time.	Josh Haberman
	This is a significant change to the upb_stream protocol, and should hopefully be the last significant change. All callbacks are now registered ahead-of-time instead of having delegated callbacks registered at runtime, which makes it much easier to aggressively optimize ahead-of-time (like with a JIT).
	Other impacts of this change:
	- You no longer need to have loaded descriptor.proto as a upb_def to load other descriptors! This means the special-case code we used for bootstrapping is no longer necessary, and we no longer need to link the descriptor for descriptor.proto into upb.
	- A client can now register any upb_value as what will be delivered to their value callback, not just a upb_fielddef*. This should allow for other clients to get more bang out of the streaming decoder.
	This change unfortunately causes a bit of a performance regression -- I think largely due to highly suboptimal code that GCC generates when structs are returned by value. See: http://blog.reverberate.org/2011/03/19/when-a-compilers-slow-code-actually-bites-you/
	On the other hand, once we have a JIT this should no longer matter.
	Performance numbers:
	plain.parsestream_googlemessage1.upb_table: 374 -> 396 (5.88)
	plain.parsestream_googlemessage2.upb_table: 616 -> 449 (-27.11)
	plain.parsetostruct_googlemessage1.upb_table_byref: 268 -> 269 (0.37)
	plain.parsetostruct_googlemessage1.upb_table_byval: 215 -> 204 (-5.12)
	plain.parsetostruct_googlemessage2.upb_table_byref: 307 -> 281 (-8.47)
	plain.parsetostruct_googlemessage2.upb_table_byval: 297 -> 272 (-8.42)
	omitfp.parsestream_googlemessage1.upb_table: 423 -> 410 (-3.07)
	omitfp.parsestream_googlemessage2.upb_table: 679 -> 483 (-28.87)
	omitfp.parsetostruct_googlemessage1.upb_table_byref: 287 -> 282 (-1.74)
	omitfp.parsetostruct_googlemessage1.upb_table_byval: 226 -> 219 (-3.10)
	omitfp.parsetostruct_googlemessage2.upb_table_byref: 315 -> 298 (-5.40)
	omitfp.parsetostruct_googlemessage2.upb_table_byval: 297 -> 287 (-3.37)
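To illustrate what "all callbacks registered ahead-of-time" might look like, here is a minimal sketch. It is not the upb_stream API: the names (`reg_handlers`, `reg_set_value_cb`, `reg_freeze`, `reg_dispatch`), the explicit freeze step, and the fixed field limit are all assumptions, and a plain `void*` stands in for the arbitrary bound value the commit describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define REG_MAX_FIELD 16  /* hypothetical limit that keeps the table flat */

/* Value callback; `bound` is whatever the client registered for this field. */
typedef void reg_value_cb(void *closure, void *bound, uint64_t val);

typedef struct {
  reg_value_cb *cb[REG_MAX_FIELD + 1];  /* indexed by field number */
  void *bound[REG_MAX_FIELD + 1];
  bool frozen;  /* once set, no further registration is accepted */
} reg_handlers;

/* Registers a callback and its bound value; only allowed before freezing. */
static bool reg_set_value_cb(reg_handlers *h, uint32_t field,
                             reg_value_cb *cb, void *bound) {
  if (h->frozen || field == 0 || field > REG_MAX_FIELD) return false;
  h->cb[field] = cb;
  h->bound[field] = bound;
  return true;
}

static void reg_freeze(reg_handlers *h) { h->frozen = true; }

/* Dispatch is a table lookup; nothing is registered while parsing runs.
 * Returns false when no callback was registered for the field. */
static bool reg_dispatch(const reg_handlers *h, void *closure,
                         uint32_t field, uint64_t val) {
  assert(h->frozen);
  if (field == 0 || field > REG_MAX_FIELD || !h->cb[field]) return false;
  h->cb[field](closure, h->bound[field], val);
  return true;
}

/* Example callback: adds the value to the counter bound to the field. */
static void reg_add_to_counter(void *closure, void *bound, uint64_t val) {
  (void)closure;
  *(uint64_t *)bound += val;
}
```

Because the table is complete and immutable before the first byte is parsed, a code generator could in principle specialize the dispatch for each field, which is the optimization opportunity the commit message points at.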
2011-02-23	Added proper support for enum default values.	Joshua Haberman

2011-02-18	Bring lua extension up to date with new symtab APIs.	Joshua Haberman

2011-02-17	First version of an assembly language decoder.	Joshua Haberman
	It is slower than the C decoder for now because it falls off the fast path too often. But it can successfully decode varints, fixed32 and fixed64.
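For readers unfamiliar with the three encodings named above, the following plain-C sketch decodes them as the protobuf wire format defines them (base-128 varints, little-endian fixed32 and fixed64). It is an illustration only; the function names are hypothetical and it is unrelated to the assembly decoder in this commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decodes one base-128 varint from buf[0..len).
 * Returns the number of bytes consumed, or 0 if the input is truncated
 * or the varint is longer than 10 bytes. */
static size_t vd_decode_varint(const uint8_t *buf, size_t len, uint64_t *out) {
  assert(buf != NULL && out != NULL);
  uint64_t result = 0;
  for (size_t i = 0; i < len && i < 10; i++) {
    /* Low 7 bits carry payload, least-significant group first. */
    result |= (uint64_t)(buf[i] & 0x7f) << (7 * i);
    /* A clear high bit marks the final byte. */
    if ((buf[i] & 0x80) == 0) {
      *out = result;
      return i + 1;
    }
  }
  return 0;
}

/* Decodes a little-endian 32-bit value from exactly 4 bytes. */
static uint32_t vd_decode_fixed32(const uint8_t *buf) {
  return (uint32_t)buf[0] | ((uint32_t)buf[1] << 8) |
         ((uint32_t)buf[2] << 16) | ((uint32_t)buf[3] << 24);
}

/* Decodes a little-endian 64-bit value from exactly 8 bytes. */
static uint64_t vd_decode_fixed64(const uint8_t *buf) {
  return (uint64_t)vd_decode_fixed32(buf) |
         ((uint64_t)vd_decode_fixed32(buf + 4) << 32);
}
```

For example, the two bytes `0xAC 0x02` decode to 300: the payload groups are `0x2C` and `0x02`, giving `0x2C | (0x02 << 7)`.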
2011-02-17	Split inttable into a hash part and an array part.	Joshua Haberman
	upb_inttable() now supports a "compact" operation that will decide on an array size and put all entries with small enough keys into the array part for faster lookup. Also exposed the upb_itof_ent structure and put a few useful values there, so they are one fewer pointer chase away.
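The array-part/hash-part split described above can be sketched as follows. This is not upb's inttable: all names are hypothetical, the hash part has a fixed capacity to keep the code short, and the rule for choosing the array size (the largest size for which more than half the slots would be used) is an assumption, since the commit does not say how the size is decided.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define IT_HASH_CAP 64  /* fixed capacity; a real table would grow */

typedef struct { uint32_t key; int val; bool used; } it_ent;

typedef struct {
  it_ent hash[IT_HASH_CAP];  /* hash part: open addressing, linear probing */
  int *array;                /* array part: index == key */
  bool *present;             /* which array slots hold a value */
  uint32_t array_size;
  int count;                 /* entries currently in the hash part */
} it_table;

static void it_init(it_table *t) { memset(t, 0, sizeof(*t)); }

static void it_free(it_table *t) { free(t->array); free(t->present); }

static bool it_insert(it_table *t, uint32_t key, int val) {
  if (key < t->array_size) {  /* fast path: direct index */
    t->array[key] = val;
    t->present[key] = true;
    return true;
  }
  if (t->count == IT_HASH_CAP) return false;  /* sketch: refuse when full */
  uint32_t i = key % IT_HASH_CAP;
  while (t->hash[i].used && t->hash[i].key != key) i = (i + 1) % IT_HASH_CAP;
  if (!t->hash[i].used) t->count++;
  t->hash[i] = (it_ent){key, val, true};
  return true;
}

static bool it_lookup(const it_table *t, uint32_t key, int *val) {
  if (key < t->array_size) {
    if (!t->present[key]) return false;
    *val = t->array[key];
    return true;
  }
  uint32_t i = key % IT_HASH_CAP;
  for (int n = 0; n < IT_HASH_CAP && t->hash[i].used; n++) {
    if (t->hash[i].key == key) { *val = t->hash[i].val; return true; }
    i = (i + 1) % IT_HASH_CAP;
  }
  return false;
}

/* Chooses an array size, then rebuilds the table so that small keys land in
 * the array part and the rest are rehashed. */
static bool it_compact(it_table *t) {
  assert(t->array_size == 0);  /* sketch: compact once, after bulk insertion */
  uint32_t size = 0, seen = 0;
  for (uint32_t n = 1; n <= IT_HASH_CAP; n++) {
    int v;
    if (it_lookup(t, n - 1, &v)) seen++;
    if (seen * 2 > n) size = n;  /* more than half of [0, n) is in use */
  }
  if (size == 0) return true;
  int *array = calloc(size, sizeof(int));
  bool *present = calloc(size, sizeof(bool));
  if (!array || !present) { free(array); free(present); return false; }
  it_ent old[IT_HASH_CAP];
  memcpy(old, t->hash, sizeof(old));
  memset(t->hash, 0, sizeof(t->hash));
  t->count = 0;
  t->array = array;
  t->present = present;
  t->array_size = size;
  for (int i = 0; i < IT_HASH_CAP; i++)
    if (old[i].used) it_insert(t, old[i].key, old[i].val);
  return true;
}
```

With keys 1, 2, 3 and 1000 inserted, this sketch's rule picks an array of size 5, so lookups for 1 to 3 become a bounds check plus an index, while 1000 stays in the hash part.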
2011-02-13	Merged core/ and stream/ -> src/. The split wasn't worth it.	Joshua Haberman

2010-07-09	Split src/ into core/ and stream/.	Joshua Haberman

2010-07-09	Strip out some stuff that's not currently being used.	Joshua Haberman

2010-07-09	Dynamically allocate string for error msg.	Joshua Haberman

2010-06-10	Implement proper type checking again.	Joshua Haberman

2010-06-09	Decoder compiles but doesn't work yet.	Joshua Haberman

2010-06-09	More decoder work, first attempts at compiling it.	Joshua Haberman

2010-06-05	More work on the decoder.	Joshua Haberman

2010-06-03	More incremental work.	Joshua Haberman

2010-06-03	WIP: intrusive changes to upb_decoder.	Joshua Haberman

2010-05-24	Defined the upb_src and upb_bytesrc interfaces.	Joshua Haberman

2010-01-16	Flesh out implementation of upb_sizebuilder.	Joshua Haberman

2010-01-16	Removed union tag from types.	Joshua Haberman

2010-01-15	Remove struct keyword from all types, use typedef instead.	Joshua Haberman

2010-01-13	Incremental work on serialization.	Joshua Haberman

2010-01-04	upb_array -> upb_arrayptr.	Joshua Haberman

2010-01-02	Move string representations back upb.h -> upb_data.h.	Joshua Haberman

2010-01-02	upb_string* -> upb_strptr, to follow aliasing rules.	Joshua Haberman

2009-12-31	upbc compiles and links! But probably doesn't work yet.	Joshua Haberman

2009-12-29	Getting closer, only a few functions undefined now.	Joshua Haberman

2009-12-28	More incremental work; ported some of upbc.	Joshua Haberman

2009-12-06	Truly fixed type cyclic refcounting.	Joshua Haberman

2009-12-06	Circular references truly work now, along with a test.	Joshua Haberman
	One simplification to come.
2009-12-05	Scheme for collecting circular refs.	Joshua Haberman
	"make descriptorgen" is now valgrind-clean again.
2009-12-05	Make defs refcounted, rename upb_context -> upb_symtab.	Joshua Haberman
	There is currently a memory leak when type definitions form cycles. This will need to be dealt with.
2009-11-27	WIP of cleaning up defs.	Joshua Haberman

2009-11-26	Make upb_msgdef own all its data.	Joshua Haberman
	This is in anticipation of making upb_msgdef's easy to dup. This involved removing all traces of any descriptors from the defs.
2009-11-14	Changed parse API to know about msgdefs.	Joshua Haberman
	This should make it both easier to use and easier to optimize, in exchange for a small amount of generality. In practice, any remotely normal case is still very natural.
2009-11-14	Renamed upb_msg_fielddef -> upb_fielddef, upb_enum -> upb_enumdef.	Joshua Haberman

2009-09-26	Use a status object for errors so a message can be returned.	Joshua Haberman
	Also delay deletion of subfields until the entire message is deleted.
2009-08-27	Some cleanup and reformatting, fixed the benchmarks.	Joshua Haberman

2009-08-24	Significant memory-management refactoring and Python extension.	Joshua Haberman

2009-08-12	Refactoring: unify upb_msg.	Joshua Haberman
	The cost is that a upb_msg will now always have an overhead of 2*sizeof(void*). This is comparable to proto2 overhead. The benefit is that upb_msg is now self-describing, and read-only algorithms can now operate on a upb_msg regardless of the memory-management scheme. Also, upb_array and upb_string now know inherently if they own their associated memory, and upb_array has a generic pointer for memory management purposes like upb_msg does.
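One way to picture the two-pointer overhead is a message header holding a pointer to its definition and a generic memory-management pointer. The layout below is a guess for illustration only; the type and field names (`mh_msgdef`, `mh_msg`, `def`, `mmdata`) are hypothetical and not taken from upb.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical message definition: describes a message type. */
typedef struct {
  const char *name;
  int field_count;
} mh_msgdef;

/* Hypothetical message header: two pointers of per-message overhead. */
typedef struct {
  const mh_msgdef *def;  /* makes the message self-describing */
  void *mmdata;          /* opaque hook for whichever memory manager owns it */
} mh_msg;

/* A read-only algorithm needs only the header, never the memory manager. */
static int mh_field_count(const mh_msg *msg) {
  assert(msg->def != NULL);
  return msg->def->field_count;
}
```

Code such as `mh_field_count` works the same way whether `mmdata` points at a refcount, a language runtime's object, or nothing at all.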
2009-08-07	Major refactoring of upb_msg. Temporary functionality regression.	Joshua Haberman
	There is significant refactoring here, as well as some more trivial name changes. upb_msg has become upb_msgdef, to reflect the fact that a upb_msg is not itself a message, it describes a message. There are other renamings, such as upb_parse_state -> upb_stream_parser. More significantly, the upb_msg class and parser have been refactored to reflect my recent realization about how memory management should work. upb_msg now has no memory management, and a memory management scheme (that works beautifully with multiple language runtimes) will be layered on top of it. This iteration has the new, read-only upb_msg. upb_mm_msg (a memory-managed message class) will come in the next change.
2009-07-29	Header file rearranging/prettifying.	Joshua Haberman

2009-07-22	Compiler finally works (except string arrays). Untested. Holy crap that was a lot of work.	Joshua Haberman