[Development] API review request: CBOR Stream reader and writer

Discussion:

Thiago Macieira

2018-01-17 21:25:53 UTC

Hello

I finished writing the documentation for the two basic classes for CBOR. You
can find them in reviews
https://codereview.qt-project.org/107465
https://codereview.qt-project.org/107466

Please review. I will take a couple more days writing the docs for
QCbor{Value,Map,Array} and then I'll upload that too.

I'm also interested in what I could write as an example. Please send
suggestions.

My current idea is a command-line tool that converts between serialisation
formats:
* CBOR
* CBOR diagnostic notation (output only, since I won't write the parser)
* JSON
* XML
* Qt binary JSON
* Plain QDataStream (output only, since it's not self-describing)
* QDataStream-serialised QVariant (is self-describing)

Though, because of the conversions, this example is ideal for QCborValue, not
the stream reader and writer.

Another idea is to update the network-chat example to use CBOR instead of its
plaintext protocol. In this one, I could use the stream reader and writer.
This example is a perfect candidate to have a CoAP version in the future too.

--
Thiago Macieira - thiago.macieira (AT) intel.com
Software Architect - Intel Open Source Technology Center

Thiago Macieira

2018-01-18 03:22:36 UTC

Post by Thiago Macieira
Another idea is to update the network-chat example to use CBOR instead of
its plaintext protocol. In this one, I could use the stream reader and
writer. This example is a perfect candidate to have a CoAP version in the
future too.

I've done this then:
https://codereview.qt-project.org/217078
https://codereview.qt-project.org/217079

--
Thiago Macieira - thiago.macieira (AT) intel.com
Software Architect - Intel Open Source Technology Center

Dominik Holland

2018-01-18 08:51:18 UTC

Post by Thiago Macieira

https://codereview.qt-project.org/217078
https://codereview.qt-project.org/217079

Why not keep the existing example and do a network-chat-cbor example
which is more advanced?

In the example you could also explain why someone wants to use CBOR.

Dominik

Thiago Macieira

2018-01-18 15:44:48 UTC

Post by Dominik Holland

Post by Thiago Macieira

https://codereview.qt-project.org/217078
https://codereview.qt-project.org/217079

Why not keep the existing example and do a network-chat-cbor example
which is more advanced?

I can do that if people think it would be better. I hadn't thought of it, so I
just modified the example.

Post by Dominik Holland
In the example you could also explain why someone wants to use CBOR.

The difference in the state machine for receiving incomplete data should be
enough of a reason. Plus the fact that it no longer needs to parse strings for
numbers.
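
Both points can be sketched outside Qt. The following is a minimal
illustration in Python, not the QCborStreamReader API: the function name
try_read_item is hypothetical, and it handles only unsigned integers and
definite-length text strings. Because every CBOR item starts with its type
and length, the receiver can tell from the header alone whether the buffer
holds a complete item, and integers arrive in binary form.

```python
# Width in bytes of the argument that follows the initial byte when the
# additional-information field is 24, 25, 26 or 27.
_ARG_WIDTH = {24: 1, 25: 2, 26: 4, 27: 8}


def try_read_item(buf: bytes):
    """Try to decode one CBOR item from the start of buf.

    Returns (value, bytes_consumed), or None if buf is still incomplete.
    Hypothetical helper: only unsigned integers (major type 0) and
    definite-length text strings (major type 3) are handled.
    """
    if not buf:
        return None
    major, info = buf[0] >> 5, buf[0] & 0x1F
    if info < 24:
        arg, pos = info, 1
    elif info in _ARG_WIDTH:
        width = _ARG_WIDTH[info]
        if len(buf) < 1 + width:
            return None  # the header itself has not fully arrived
        arg = int.from_bytes(buf[1:1 + width], "big")
        pos = 1 + width
    else:
        raise ValueError("indefinite or reserved lengths are not handled here")

    if major == 0:
        return arg, pos  # already a binary number, no string parsing needed
    if major == 3:
        if len(buf) < pos + arg:
            return None  # the header says exactly how many bytes are missing
        return buf[pos:pos + arg].decode("utf-8"), pos + arg
    raise ValueError("major type %d is not handled in this sketch" % major)
```

A caller would append each network chunk to its buffer and call the function
again whenever it returned None. With only part of a five-character string
received, try_read_item(bytes.fromhex("6548656c")) returns None; once the
rest arrives, the same call on the full buffer returns the string and the
number of bytes to discard.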

--
Thiago Macieira - thiago.macieira (AT) intel.com
Software Architect - Intel Open Source Technology Center

Thiago Macieira

2018-01-22 09:37:40 UTC

Post by Thiago Macieira
My current idea is a command-line tool that converts between serialisation
* CBOR
* CBOR diagnostic notation (output only, since I won't write the parser)
* JSON
* XML
* Qt binary JSON
* Plain QDataStream (output only, since it's not self-describing)
* QDataStream-serialised QVariant (is self-describing)
Though, because of the conversions, this example is ideal for QCborValue,
not the stream reader and writer.

Done: https://codereview.qt-project.org/217410
I'd really appreciate it if someone reviewed my code that parses XML.

I've also used this example application to benchmark the various aspects of
the encoder. It's helped me find a couple of bottlenecks in the
implementation, which led to a redesign of the string parsing in
QCborStreamReader.

The current numbers, for parsing an array with 5000 entries of a map, the
contents of which were obtained by:
qtplugininfo --full-json /usr/lib64/qt5/plugins/akregator_config_advanced.so

Binary JSON validating:
97,003559 task-clock:u (msec)
237.092.857 cycles
437.005.872 instructions
[15.2% was spent in QIODevice::readAll, 59.1% in fromBinaryData]

JSON parsing:
273,359723 task-clock:u (msec)
793.297.513 cycles
2.698.607.303 instructions
[4.7% in readAll(), 78.2% in fromJson]

CBOR parsing:
341,311535 task-clock
885.053.081 cycles
2.548.803.851 instructions

The string parser is still showing up at 70.5% of the full execution time, of
which 33.4% are in QCborStreamReader and 20.4% calling isValidUtf8(). The
program spends 25.0% inside QIODevice, inside the string decoder. Unlike the
JSON parser, we don't operate on a pre-read byte array, but directly on the
QIODevice, checking for size.

The JSON parser spends 56.2% of the full execution time parsing strings.

As for the encoders, the test is done by reading from Binary JSON, converting
to QVariant, then back from QVariant and then saving.

Binary JSON (baseline):
724,619527 task-clock
1.792.421.866 cycles
2.983.986.222 instructions
Time spent in toBinaryData: 1.24%

JSON:
1150,128441 task-clock:u (msec)
3.179.240.094 cycles
6.673.262.299 instructions
Time spent in the encoder: 34.5%, so ~403 ms

CBOR:
930,697635 task-clock:u (msec)
2.391.326.016 cycles
4.910.714.973 instructions
Time spent in the encoder: 21.2%, so 176 ms

File sizes:
Binary JSON: 55,540,020 bytes (546 MB/s on read, 5900 MB/s write)
JSON: 57,580,003 bytes (201 MB/s read, 136 MB/s write)
CBOR: 41,200,002 bytes (115 MB/s read, 223 MB/s write)
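
As a cross-check, the read figures can be reproduced from the file sizes and
the task-clock times of the three parsing runs. The sketch below assumes that
"MB" here means 2^20 bytes; that is not stated in the numbers themselves, but
it is the reading that matches them. The function name is illustrative only.

```python
MIB = 1024 * 1024  # assumption: "MB" in the figures means 2**20 bytes

# (file size in bytes, parsing task-clock in milliseconds), copied from the
# figures quoted in this message
runs = {
    "Binary JSON": (55_540_020, 97.003559),
    "JSON": (57_580_003, 273.359723),
    "CBOR": (41_200_002, 341.311535),
}


def read_throughput(size_bytes, time_ms):
    """Return the throughput in MiB per second."""
    return (size_bytes / MIB) / (time_ms / 1000.0)


for name, (size, ms) in runs.items():
    print(f"{name}: {read_throughput(size, ms):.0f} MB/s")
```

This prints 546, 201 and 115 MB/s. The write figures appear to be derived
the same way from the time attributed to each encoder, for example 176 ms
for CBOR.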

--
Thiago Macieira - thiago.macieira (AT) intel.com
Software Architect - Intel Open Source Technology Center

Thiago Macieira

2018-01-22 18:39:51 UTC

Post by Thiago Macieira
341,311535 task-clock
885.053.081 cycles
2.548.803.851 instructions
The string parser is still showing up at 70.5% of the full execution time,
of which 33.4% are in QCborStreamReader and 20.4% calling isValidUtf8().
The program spends 25.0% inside QIODevice, inside the string decoder.
Unlike the JSON parser, we don't operate on a pre-read byte array, but
directly on the QIODevice, checking for size.

I've further optimised isValidUtf8() and now the string parser is only 61% of
the execution time, with isValidUtf8() down to 7.7%.

https://codereview.qt-project.org/217084 for the optimisations.

--
Thiago Macieira - thiago.macieira (AT) intel.com
Software Architect - Intel Open Source Technology Center

Thiago Macieira

2018-01-23 00:56:56 UTC

Post by Thiago Macieira
97,003559 task-clock:u (msec)
237.092.857 cycles
437.005.872 instructions
[15.2% was spent in QIODevice::readAll, 59.1% in fromBinaryData]
273,359723 task-clock:u (msec)
793.297.513 cycles
2.698.607.303 instructions
[4.7% in readAll(), 78.2% in fromJson]
341,311535 task-clock
885.053.081 cycles
2.548.803.851 instructions

After fixing the converter example to properly use mmap in all three cases,
plus refactoring the CBOR parser to operate on a pre-loaded array and use a
larger buffer than single-digit byte counts, the numbers are:

Binary JSON:
69,844846 task-clock:u (msec)
196.906.259 cycles:u
422.255.714 instructions:u
[There's no readAll(); 70.2% of the time is spent inside
QJsonPrivate::Object::isValid]

JSON:
255,809132 task-clock:u (msec)
771.771.000 cycles:u
2.690.966.058 instructions:u
[80.2% inside QJsonPrivate::Parser::parseValue, 58.7% inside
QJsonPrivate::Parser::parseString and 16.3% inside QUtf8Functions::fromUtf8]

CBOR:
239,059121 task-clock:u (msec)
562.474.857 cycles:u
1.431.590.428 instructions:u
[71.6% inside QCborValue::fromCbor, 65.0% inside
QCborContainerPrivate::decodeStringFromCbor, 25.5% inside
QCborStreamReader::readStringChunk plus 12.6% inside QUtf8::isValidUtf8]

--
Thiago Macieira - thiago.macieira (AT) intel.com
Software Architect - Intel Open Source Technology Center

Thiago Macieira

2018-01-23 01:45:09 UTC

Post by Thiago Macieira
After fixing the converter example to properly use mmap in all three cases,
plus refactoring the CBOR parser to operate on a pre-loaded array and use

A major difference between the three is the memory allocation strategy. The
Binary JSON needs none, since it uses an in-memory format for the on-disk
format. So the application runs with 53880 kB of the mmapped file in a total
of 64384 kB RSS.

The parsed QJsonDocument uses the same allocation strategy, which is one of its
greatest advantages. It allocates a buffer equal to the size of the JSON
input, so after unloading the 55872 kB of the source file, it keeps a huge
block of 55792 kB allocated, in a total of 65928 kB RSS.

A major drawback of this is that the Binary JSON memory format is limited to
128 MB in total. There's a bug report about this; changing the format to
raise the limit, even just in memory, will change the numbers above.

QCborValue does no such thing, but it's also not limited. Since it uses
regular memory allocation, after unloading the 40244 kB file, it needs 110356
kB of heap, for a total of 120428 kB RSS.

For comparison, loading the same data from a QDataStream and keeping in a
QVariant requires 233944 kB of heap, for a total of 244088 kB RSS. The
QDataStream file is 76156 kB and that includes all strings stored as UTF-16,
so it's not the UTF-16 vs UTF-8 vs Latin1 that accounts for the majority of
the overhead.
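
To put the three in-memory cases side by side, the ratio of memory held to
source file size can be worked out from the kB figures above. This is plain
arithmetic on the numbers in this message; the labels and the function name
are illustrative only.

```python
# (source file size in kB, heap or retained buffer in kB), from the text above
cases = {
    "QJsonDocument": (55872, 55792),
    "QCborValue": (40244, 110356),
    "QVariant via QDataStream": (76156, 233944),
}


def overhead(file_kb, heap_kb):
    """Return the memory held as a multiple of the source file size."""
    return heap_kb / file_kb


for name, (file_kb, heap_kb) in cases.items():
    print(f"{name}: {overhead(file_kb, heap_kb):.2f}x the file size")
```

This gives roughly 1.00x for QJsonDocument, 2.74x for QCborValue and 3.07x
for the QVariant loaded from QDataStream.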

--
Thiago Macieira - thiago.macieira (AT) intel.com
Software Architect - Intel Open Source Technology Center

Thiago Macieira

2018-01-24 18:49:32 UTC

Post by Thiago Macieira
I'm also interested in what I could write as an example. Please send
suggestions.

I've added a CBOR dump tool example, which can help you with understanding
what you have:

$ convert/convert -o line-wrap=no /tmp/test3.cbor
{ "Hello": false, "World": 1.2, "foo": [ 1.0, 2.0, 3.0 ] }
$ convert/convert /tmp/test3.cbor
{
"Hello": false,
"World": 1.2,
"foo": [
1.0,
2.0,
3.0
]
}
$ cbordump/cbordump --compact /tmp/test3.cbor
55799({ "Hello": false, "World": 1.2, "foo": [ 1.0f, 2.0f, 3.0f ] })
$ cbordump/cbordump /tmp/test3.cbor
55799({
"Hello": false,
"World": 1.2,
"foo": [
1.0f,
2.0f,
3.0f
]
})

You may be asking why the two tools print different output. What's that 55799
and why do 1, 2 and 3 have "f" at the end? Well, then use the annotated mode:

$ cbordump/cbordump --annotated /tmp/test3.cbor
d9 d9 f7 # Tag 55799 (Self-describe CBOR; see Section 2.4.5 [RFC7049])
a3 # Map length 3
65 # Text string length 5
48 65 6c 6c 6f # "Hello"
f4 # Simple Type false
65 # Text string length 5
57 6f 72 6c 64 # "World"
fb 3f f3 33 33 33 33 33 33 # Double 1.2
63 # Text string length 3
66 6f 6f # "foo"
83 # Array length 3
fa 3f 80 00 00 # Float 1.
fa 40 00 00 00 # Float 2.
fa 40 40 00 00 # Float 3.

That 55799 allows this:
$ file /tmp/test3.cbor
/tmp/test3.cbor: Concise Binary Object Representation (CBOR) container

The "f" indicates that it's a float, not a double.
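
The annotated bytes can also be decoded by hand. The sketch below is a
minimal Python decoder, not the cbordump implementation: it covers only the
item types that appear in this dump, and the ("tag", number, content) tuple
is just a convention chosen for the illustration. It shows where the tag
number and the float/double distinction come from: 0xd9 introduces a tag
with a 2-byte number, 0xfa a 4-byte float and 0xfb an 8-byte double.

```python
import struct


def decode(buf, pos=0):
    """Decode one CBOR item starting at buf[pos]; return (value, next_pos).

    Covers only what the dump contains: unsigned integers, text strings,
    arrays, maps, tags, false/true, and 32- and 64-bit floats.
    """
    first = buf[pos]
    major, info = first >> 5, first & 0x1F
    pos += 1

    if major == 7:  # simple values and floating point
        if info == 20:
            return False, pos
        if info == 21:
            return True, pos
        if info == 26:  # 0xfa: IEEE 754 single precision, big endian
            return struct.unpack(">f", buf[pos:pos + 4])[0], pos + 4
        if info == 27:  # 0xfb: IEEE 754 double precision, big endian
            return struct.unpack(">d", buf[pos:pos + 8])[0], pos + 8
        raise ValueError("unhandled simple value")

    # For the other major types the low five bits hold a count or value,
    # either directly (< 24) or in the following 1, 2, 4 or 8 bytes.
    if info < 24:
        arg = info
    elif info <= 27:
        width = 1 << (info - 24)
        arg = int.from_bytes(buf[pos:pos + width], "big")
        pos += width
    else:
        raise ValueError("unhandled additional information")

    if major == 0:  # unsigned integer
        return arg, pos
    if major == 3:  # text string of arg bytes
        return buf[pos:pos + arg].decode("utf-8"), pos + arg
    if major == 4:  # array of arg items
        items = []
        for _ in range(arg):
            item, pos = decode(buf, pos)
            items.append(item)
        return items, pos
    if major == 5:  # map of arg key/value pairs
        result = {}
        for _ in range(arg):
            key, pos = decode(buf, pos)
            val, pos = decode(buf, pos)
            result[key] = val
        return result, pos
    if major == 6:  # tag number followed by the tagged item
        inner, pos = decode(buf, pos)
        return ("tag", arg, inner), pos
    raise ValueError("unhandled major type")


# The bytes of the annotated dump, in order.
DUMP = bytes.fromhex(
    "d9d9f7" "a3"
    "65" "48656c6c6f" "f4"
    "65" "576f726c64" "fb3ff3333333333333"
    "63" "666f6f" "83" "fa3f800000" "fa40000000" "fa40400000"
)
value, end = decode(DUMP)
print(value)
```

The result is the tag 55799 wrapping a map with "Hello" mapped to false,
"World" to 1.2 and "foo" to the list of 1.0, 2.0 and 3.0.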

And just because I can:

$ qtplugininfo --full-json $QTOBJDIR/plugins/sqldrivers/libqsqlmysql.so | \
./convert/convert -O cbor -o signature=no | \
./cbordump/cbordump --annotated
a5 # Map length 5
63 # Text string length 3
49 49 44 # "IID"
78 2c # Text string length 44
6f 72 67 2e 71 74 2d 70 72 6f 6a 65 63 74 2e # "org.qt-project."
51 74 2e 51 53 71 6c 44 72 69 76 65 72 46 61 # "Qt.QSqlDriverFa"
63 74 6f 72 79 49 6e 74 65 72 66 61 63 65 # "ctoryInterface"
68 # Text string length 8
4d 65 74 61 44 61 74 61 # "MetaData"
a1 # Map length 1
64 # Text string length 4
4b 65 79 73 # "Keys"
82 # Array length 2
67 # Text string length 7
51 4d 59 53 51 4c 33 # "QMYSQL3"
66 # Text string length 6
51 4d 59 53 51 4c # "QMYSQL"
69 # Text string length 9
63 6c 61 73 73 4e 61 6d 65 # "className"
72 # Text string length 18
51 4d 59 53 51 4c 44 72 69 76 65 72 50 6c 75 # "QMYSQLDriverPlu"
67 69 6e # "gin"
65 # Text string length 5
64 65 62 75 67 # "debug"
f5 # Simple Type true
67 # Text string length 7
76 65 72 73 69 6f 6e # "version"
1a 00 05 0a 00 # Unsigned integer 0x50a00
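
The last item is easier to read once split into bytes. Assuming the
"version" field follows the QT_VERSION layout, with major, minor and patch
in one byte each, a small Python sketch (the function name is hypothetical)
recovers the release number:

```python
def split_qt_version(value):
    """Split a QT_VERSION-style integer (0xMMNNPP) into (major, minor, patch)."""
    return (value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF


print(split_qt_version(0x50A00))
```

Under that assumption the value corresponds to version 5.10.0.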

--
Thiago Macieira - thiago.macieira (AT) intel.com
Software Architect - Intel Open Source Technology Center

Thiago Macieira

2018-01-24 19:45:00 UTC

Post by Thiago Macieira
I've added a CBOR dump tool example, which can help you with understanding

One more difference: since "convert" uses QCborValue, it will normalise a few
items:

$ ./convert/convert /tmp/test3.cbor
[
32("https://example.com/© "),
0("2018-01-10T06:24:37.000Z")
]
$ ./cbordump/cbordump /tmp/test3.cbor
[
32("HTTPS://EXAMPLE.COM/%C2%A9 "),
1(1515565477)
]

Note how there's information loss if you try to convert this to JSON:
$ ./convert/convert -O json /tmp/test3.cbor
[
"https://example.com/%C2%A9%20",
"2018-01-10T06:24:37.000Z"
]
[This was after change https://codereview.qt-project.org/217854 to QJsonValue]

$ ./cbordump/cbordump --annotated /tmp/test3.cbor
82 # Array length 2
d8 20 # Tag 32 (URI; see Section 2.4.4.3 [RFC7049])
78 1b # Text string length 27
48 54 54 50 53 3a 2f 2f 45 58 41 4d 50 4c # "HTTPS://EXAMPL"
45 2e 43 4f 4d 2f 25 43 32 25 41 39 20 # "E.COM/%C2%A9 "
c1 # Tag 1 (Epoch-based date/time; see Section 2.4.1 [RFC7049])
1a 5a 55 b1 a5 # Unsigned integer 0x5a55b1a5
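
The date normalisation can be checked by hand: the tag 1 value is a count of
seconds since 1970-01-01 UTC, and the tag 0 form is the same instant written
as a date/time string. The following is a Python sketch of that conversion
only, not of what QCborValue does internally; the function name is
hypothetical.

```python
from datetime import datetime, timezone


def epoch_to_iso(seconds):
    """Format a CBOR tag 1 value (seconds since the epoch) as a tag 0 string."""
    stamp = datetime.fromtimestamp(seconds, tz=timezone.utc)
    # isoformat() writes the UTC offset as "+00:00"; the output above uses "Z".
    return stamp.isoformat(timespec="milliseconds").replace("+00:00", "Z")


print(epoch_to_iso(0x5A55B1A5))
```

This prints 2018-01-10T06:24:37.000Z, the string shown in the convert output.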

--
Thiago Macieira - thiago.macieira (AT) intel.com
Software Architect - Intel Open Source Technology Center