Parsing JSON Really Quickly: Lessons Learned

InfoQ

4 years ago

75,519 views


Comments:

@willw2596 - 16.02.2024 03:25

This is an amazing presentation. Thank you!

@whamer100 - 16.12.2023 17:26

GTA5 uses the inverse of this for its JSON parsing: it takes 6 minutes to parse a 10-megabyte JSON file for some ungodly reason.

@llothar68 - 06.05.2023 18:03

Now I wonder what magic a binary format could do with SIMD bit magic. I know Protocol Buffers and some others like BJSON are not doing this.

@KarlForner - 21.02.2023 21:01

Very impressive, and nice speech. Thanks.

@markthomas9641 - 26.03.2022 19:20

Great example, thank you. DeltaJSON is really useful if you are working with JSON, it does compare, merge and graft.

@andrzejsupermocny2386 - 11.11.2021 23:31

If anyone else is confused by slides 40 and 41: there is an error, and the order should be brackets -> 1, comma -> 2, colon -> 4, most whitespace -> 8, whitespace -> 16, other -> 0. Then the math checks out. Had me scratching my head for a long time...
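
For anyone who wants to check those class values by hand, here is a minimal scalar sketch of a two-table nibble lookup. The tables are an assumption constructed for this illustration so that they reproduce the values listed above; they are not copied from the slides or from any library. It also assumes that "most whitespace" means tab, line feed, and carriage return, and that "whitespace" means the space character. The names `LOW_NIBBLE`, `HIGH_NIBBLE`, `classify`, and `classify_all` are hypothetical.

```python
# Hypothetical lookup tables, indexed by the low and high 4 bits of a byte.
# A byte's class is the bitwise AND of the two table entries.
# Classes: brackets = 1, comma = 2, colon = 4,
#          tab/LF/CR = 8, space = 16, everything else = 0.
LOW_NIBBLE = [16, 0, 0, 0, 0, 0, 0, 0, 0, 8, 12, 1, 2, 9, 0, 0]
HIGH_NIBBLE = [8, 0, 18, 4, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]


def classify(byte: int) -> int:
    """Return the class value of a single byte (0..255)."""
    return LOW_NIBBLE[byte & 0x0F] & HIGH_NIBBLE[byte >> 4]


def classify_all(data: bytes) -> list:
    """Classify every byte of the input, one at a time."""
    return [classify(b) for b in data]


if __name__ == "__main__":
    print(classify_all(b'{"a": [1, 2]}'))
```

A SIMD version would do both table lookups for a whole register of bytes at once with a byte-shuffle instruction; the scalar loop only shows why the AND of the two entries isolates each class without any branching.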

@blackwhattack - 18.07.2021 12:33

If you want fast float serialization you should probably dump the in-memory representation, no?
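
A rough sketch of what "dumping the in-memory representation" could look like, using Python's standard `struct` module. The helper names `dump_double` and `load_double` are hypothetical, and fixing the byte order to little-endian is an assumption made here so that the bytes are portable between machines.

```python
import struct


def dump_double(value: float) -> bytes:
    """Write a float as its 8-byte IEEE 754 binary64 form (little-endian)."""
    return struct.pack("<d", value)


def load_double(raw: bytes) -> float:
    """Read back a float written by dump_double."""
    return struct.unpack("<d", raw)[0]


if __name__ == "__main__":
    raw = dump_double(0.1)
    print(len(raw), load_double(raw) == 0.1)
```

The round trip is exact because no decimal conversion happens, but the result is no longer JSON, so both ends have to agree on the binary layout.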

@yadancenick2671 - 09.08.2020 03:40

"Our backend developers spend half of their time serializing and deserializing json."
From what I've worked on, JSON serializing and deserializing happens only at the entrance or the exit of request processing. During processing, the "JSON" is something like a map/dict or an object. And in a typical backend server, the time saved on JSON serialization is almost negligible compared to DB calls. But the magical branchless parsing is so amazing and beautiful.

@dielfonelletab8711 - 01.04.2020 14:51

Now I wish I had a sticker!

@yjhwang4246 - 08.03.2020 22:07

It's a really excellent approach. But I have no idea how to check the cycles. Is there any tool?

@eformance - 29.01.2020 04:30

If you have learned to program assembly language, all of these optimizations are second nature. Everyone should learn an assembly language and spend time hand optimizing their code. Learning assembly will make you a better programmer and influence your coding strategies in time-critical areas. That being said, optimizations often cause readability to suffer, so be wise about how you use coding strategies.

@trinopoty - 28.01.2020 23:36

I think if you need that level of performance, just ditch JSON in favor of something binary based like Protobuf.

@fredg8328 - 26.01.2020 17:05

Good old bitwise tricks. Reminds me how I coded 30 years ago.

@Ryan-hp6xs - 26.01.2020 10:55

Or... stop using JSON and start using Protocol Buffers. Better in virtually every way.

@PatrickKellyLoneCoder - 25.01.2020 20:36

You sir, confirmed a theory I've had for the past year but was unable to work out: you absolutely can parse (at least partially) using SIMD. Which opens up the door to parsing with a GPU. Absolutely fantastic!

@rudolphriedel541 - 25.01.2020 18:44

Getting rid of JSON might be a good idea then.

@elerius2 - 22.01.2020 21:21

This is a great talk. Really clever optimizations.


However, it reminds me that we've fallen into a pit of failure because JavaScript is the lingua franca of the web. JSON is the fault of JavaScript. It is easy to parse on the client because it can simply be "eval"ed in the browser. Messages back to the server also require parsing. And the key here is that it requires *parsing*. A binary format, such as protocol buffers, is much more efficient for data transfer, both from a payload size and processing cost perspective, because it isn't *parsed*, per se, it is simply *read*. The problem is that JavaScript wasn't (isn't?) really capable of dealing with binary payloads, so binary solutions weren't, historically, an option.


Anyway, it makes me hope that the advent of WebAssembly will offer better solutions in the next few years. It is silly to me that great minds are wasting time optimizing things at this layer of abstraction, when the abstraction itself (JSON) is flawed.

@smooth_lighting - 22.01.2020 12:56

👌🏼

@Voy2378 - 21.01.2020 19:41

Zen2 also has AVX512

- 20.01.2020 13:41

Very nice talk.

@MrMcCoyD4 - 20.01.2020 06:47

I love this! I never knew about carry-less multiplication. I also didn’t realize that disk speeds and CPU throughput were so similar nowadays
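
For others who had not met carry-less multiplication before, here is a small software sketch of the idea. Multiplying a bitmask by an all-ones word without carries gives a running XOR of the bits, which can mark the regions between pairs of quote characters. The function names `clmul`, `prefix_xor`, and `quote_mask` are hypothetical, the 64-bit width is an assumption, and escaped quotes are deliberately ignored. Real hardware does the multiplication in one instruction; the loop below only imitates it.

```python
def clmul(a: int, b: int, width: int = 64) -> int:
    """Carry-less multiply: like long multiplication, but partial
    products are combined with XOR instead of addition."""
    result = 0
    for i in range(width):
        if (b >> i) & 1:
            result ^= a << i
    return result & ((1 << width) - 1)


def prefix_xor(bits: int, width: int = 64) -> int:
    """Bit k of the result is the XOR of bits 0..k of the input."""
    return clmul(bits, (1 << width) - 1, width)


def quote_mask(text: str) -> int:
    """Set bit i when character i is a double quote (escapes ignored)."""
    return sum(1 << i for i, ch in enumerate(text) if ch == '"')


if __name__ == "__main__":
    text = '{"a":1}'
    inside = prefix_xor(quote_mask(text))
    # Bit i is set from each opening quote up to, but not including,
    # the matching closing quote.
    print(format(inside, "b"))
```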

@ChrisHinton42 - 19.01.2020 06:27

And the moral of the story is: don't use JSON

@yohan9577 - 18.01.2020 03:05

Impressive bitfield wizardry 😮 Thank you. This talk is inspiring me to rewrite a few things without branches, and to learn more about "bitwise" intrinsic operators!

@KaiHenningsen - 18.01.2020 02:09

Nice content, but somewhat painful to listen to.

@ShadowTheAge - 17.01.2020 00:11

Aren't you supposed to use classes like LeftSquareBracketMatcherStrategyFactory?

@platin2148 - 16.01.2020 23:32

You might be faster by not using the JVM, which needs to do emulations of the real platform (e.g. use POSIX write/read instead of Java's sh…), and by moving to a language that was actually designed to be easier to analyze and optimize.

@allanwind295 - 16.01.2020 06:24

Interesting lecture with a good mix of high-level strategy and low-level (pseudo) implementation details.

@subschallenge-nh4xp - 15.01.2020 12:52

I started learning JSON this year. Can somebody tell me what's going on in the video?

@toddymikey - 15.01.2020 12:43

Good lecture for programmers interested in thinking pragmatically rather than just abstractly.
