From 3f562e118f263cf21e56bf4383a2278dab73278b Mon Sep 17 00:00:00 2001
From: miloyip
Date: Wed, 15 Apr 2015 14:57:20 +0800
Subject: [PATCH] Fix "SSE 4.1 -> SSE 4.2" typo and add some comments about SIMD in internals and FAQ

---
 doc/faq.md            |  2 +-
 doc/faq.zh-cn.md      |  2 +-
 doc/features.md       |  2 +-
 doc/features.zh-cn.md |  2 +-
 doc/internals.md      | 16 +++++++++++++++-
 readme.md             |  2 +-
 readme.zh-cn.md       |  2 +-
 7 files changed, 21 insertions(+), 7 deletions(-)

diff --git a/doc/faq.md b/doc/faq.md
index bab7f5f..85afcd3 100644
--- a/doc/faq.md
+++ b/doc/faq.md
@@ -198,7 +198,7 @@
 
 3. What is SIMD? How it is applied in RapidJSON?
 
-   [SIMD](http://en.wikipedia.org/wiki/SIMD) instructions can perform parallel computation in modern CPUs. RapidJSON support Intel's SSE2/SSE4.1 to accelerate whitespace skipping. This improves performance of parsing indent formatted JSON.
+   [SIMD](http://en.wikipedia.org/wiki/SIMD) instructions can perform parallel computation in modern CPUs. RapidJSON supports Intel's SSE2/SSE4.2 to accelerate whitespace skipping. This improves performance of parsing indent formatted JSON. Define the `RAPIDJSON_SSE2` or `RAPIDJSON_SSE42` macro to enable this feature. However, running the executable on a machine without such instruction set support will make it crash.
 
 4. Does it consume a lot of memory?
 
diff --git a/doc/faq.zh-cn.md b/doc/faq.zh-cn.md
index ce6acf1..c6e7557 100644
--- a/doc/faq.zh-cn.md
+++ b/doc/faq.zh-cn.md
@@ -198,7 +198,7 @@
 
 3. 什是是SIMD?它如何用于RapidJSON?
 
-   [SIMD](http://en.wikipedia.org/wiki/SIMD)指令可以在现代CPU中执行并行运算。RapidJSON支持了Intel的SSE2/SSE4.1去加速跳过空白字符。在解析含缩进的JSON时,这能提升性能。
+   [SIMD](http://en.wikipedia.org/wiki/SIMD)指令可以在现代CPU中执行并行运算。RapidJSON支持了Intel的SSE2/SSE4.2去加速跳过空白字符。在解析含缩进的JSON时,这能提升性能。只要定义名为`RAPIDJSON_SSE2`或`RAPIDJSON_SSE42`的宏,就能启动这个功能。然而,若在不支持这些指令集的机器上执行这些可执行文件,会导致崩溃。
 
 4. 它会消耗许多内存么?
 
diff --git a/doc/features.md b/doc/features.md
index 15de259..fc54cd0 100644
--- a/doc/features.md
+++ b/doc/features.md
@@ -15,7 +15,7 @@
 * High performance
   * Use template and inline functions to reduce function call overheads.
   * Internal optimized Grisu2 and floating point parsing implementations.
-  * Optional SSE2/SSE4.1 support.
+  * Optional SSE2/SSE4.2 support.
 
 ## Standard compliance
 
diff --git a/doc/features.zh-cn.md b/doc/features.zh-cn.md
index b56c424..3a01a4b 100644
--- a/doc/features.zh-cn.md
+++ b/doc/features.zh-cn.md
@@ -15,7 +15,7 @@
 * 高性能
   * 使用模版及内联函数去降低函数调用开销。
   * 内部经优化的Grisu2及浮点数解析实现。
-  * 可选的SSE2/SSE4.1支持。
+  * 可选的SSE2/SSE4.2支持。
 
 ## 符合标准
 
diff --git a/doc/internals.md b/doc/internals.md
index ebfbc8f..de482cb 100644
--- a/doc/internals.md
+++ b/doc/internals.md
@@ -183,7 +183,21 @@ void SkipWhitespace(InputStream& s) {
 
 However, this requires 4 comparisons and a few branching for each character. This was found to be a hot spot.
 
-To accelerate this process, SIMD was applied to compare 16 characters with 4 white spaces for each iteration. Currently RapidJSON only supports SSE2 and SSE4.1 instructions for this. And it is only activated for UTF-8 memory streams, including string stream or *in situ* parsing.
+To accelerate this process, SIMD was applied to compare 16 characters with 4 white spaces for each iteration. Currently RapidJSON only supports SSE2 and SSE4.2 instructions for this. And it is only activated for UTF-8 memory streams, including string stream or *in situ* parsing.
+
+To enable this optimization, define `RAPIDJSON_SSE2` or `RAPIDJSON_SSE42` before including `rapidjson.h`. Some compilers can detect the setting, as in `perftest.h`:
+
+~~~cpp
+// __SSE2__ and __SSE4_2__ are recognized by gcc, clang, and the Intel compiler.
+// We use -march=native with gmake to enable -msse2 and -msse4.2, if supported.
+#if defined(__SSE4_2__)
+# define RAPIDJSON_SSE42
+#elif defined(__SSE2__)
+# define RAPIDJSON_SSE2
+#endif
+~~~
+
+Note that these are compile-time settings. Running the executable on a machine without such instruction set support will make it crash.
 
 ## Local Stream Copy {#LocalStreamCopy}
 
diff --git a/readme.md b/readme.md
index 59fd776..1314864 100644
--- a/readme.md
+++ b/readme.md
@@ -28,7 +28,7 @@ RapidJSON is a JSON parser and generator for C++. It was inspired by [RapidXml](
 
 * RapidJSON is small but complete. It supports both SAX and DOM style API. The SAX parser is only a half thousand lines of code.
 
-* RapidJSON is fast. Its performance can be comparable to `strlen()`. It also optionally supports SSE2/SSE4.1 for acceleration.
+* RapidJSON is fast. Its performance can be comparable to `strlen()`. It also optionally supports SSE2/SSE4.2 for acceleration.
 
 * RapidJSON is self-contained. It does not depend on external libraries such as BOOST. It even does not depend on STL.
 
diff --git a/readme.zh-cn.md b/readme.zh-cn.md
index 569044b..cf26f49 100644
--- a/readme.zh-cn.md
+++ b/readme.zh-cn.md
@@ -16,7 +16,7 @@ RapidJSON是一个C++的JSON解析器及生成器。它的灵感来自[RapidXml]
 
 * RapidJSON小而全。它同时支持SAX和DOM风格的API。SAX解析器只有约500行代码。
 
-* RapidJSON快。它的性能可与`strlen()`相比。可支持SSE2/SSE4.1加速。
+* RapidJSON快。它的性能可与`strlen()`相比。可支持SSE2/SSE4.2加速。
 
 * RapidJSON独立。它不依赖于BOOST等外部库。它甚至不依赖于STL。
 
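
Reviewer note on the `doc/internals.md` hunk: the sentence about comparing 16 characters against 4 whitespace characters per iteration may be easier to follow with a small sketch. The code below is an illustration only, not RapidJSON's implementation. The name `SkipWhitespaceSketch` and the explicit `end` pointer are assumptions made here so the example never reads past the buffer; the library's own routine may handle bounds and alignment differently. It uses the same compile-time `__SSE2__` check as the patch and falls back to the scalar loop when SSE2 is not enabled.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__SSE2__)
#  include <emmintrin.h>  // SSE2 intrinsics
#endif

// Hypothetical helper (not part of RapidJSON's API): returns a pointer to the
// first non-whitespace byte in [p, end), or end if every byte is whitespace.
inline const char* SkipWhitespaceSketch(const char* p, const char* end) {
    assert(p <= end);
#if defined(__SSE2__)
    // One 16-byte vector per whitespace character recognized by JSON.
    const __m128i sp = _mm_set1_epi8(' ');
    const __m128i nl = _mm_set1_epi8('\n');
    const __m128i cr = _mm_set1_epi8('\r');
    const __m128i tb = _mm_set1_epi8('\t');

    // Process full 16-byte blocks; the bound check prevents over-reads.
    while (end - p >= 16) {
        const __m128i s = _mm_loadu_si128(reinterpret_cast<const __m128i*>(p));
        __m128i ws = _mm_cmpeq_epi8(s, sp);
        ws = _mm_or_si128(ws, _mm_cmpeq_epi8(s, nl));
        ws = _mm_or_si128(ws, _mm_cmpeq_epi8(s, cr));
        ws = _mm_or_si128(ws, _mm_cmpeq_epi8(s, tb));

        // Bit i of nonWs is set when byte i is NOT whitespace.
        const unsigned nonWs =
            static_cast<unsigned>(~_mm_movemask_epi8(ws)) & 0xFFFFu;
        if (nonWs != 0) {
            unsigned i = 0;
            while (((nonWs >> i) & 1u) == 0) ++i;  // index of lowest set bit
            return p + i;
        }
        p += 16;  // the whole block was whitespace
    }
#endif
    // Scalar path: handles the tail, or everything when SSE2 is not enabled.
    while (p != end && (*p == ' ' || *p == '\n' || *p == '\r' || *p == '\t'))
        ++p;
    return p;
}
```

Calling `SkipWhitespaceSketch(buf, buf + len)` on an indented JSON buffer returns a pointer to the first structural character. As the patch notes, the SSE2 path is chosen at compile time, so a binary built with it still requires a CPU that supports those instructions.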