Otherwise the application crashes if the input string contains multibyte
characters. Here's the backtrace:
    2> yamerl_constr:string("いろは").
    ** exception error: bad argument
         in function  re:run/3
            called as re:run([12356,12429,12399],
                             "^(\\.[0-9]+|[0-9]+(\\.[0-9]*)?)([eE][-+]?[0-9]+)?$",
                             [{capture,none}])
         in call from yamerl_node_float:string_to_float2/1 (src/yamerl_node_float.erl, line 145)
         in call from yamerl_node_float:construct_token/3 (src/yamerl_node_float.erl, line 64)
         in call from yamerl_node_float:try_construct_token/3 (src/yamerl_node_float.erl, line 54)
         in call from yamerl_constr:try_construct/3 (src/yamerl_constr.erl, line 371)
         in call from yamerl_constr:construct/2 (src/yamerl_constr.erl, line 315)
         in call from yamerl_constr:string/2 (src/yamerl_constr.erl, line 203)
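
The badarg above comes from re:run/3 receiving a list with code points
above 255 without being told the subject is Unicode. A minimal sketch of
one way to avoid it, assuming the "unicode" option of re:run/3 is
acceptable here; the module and function names are hypothetical and are
not part of yamerl:

```erlang
-module(float_re_sketch).
-export([looks_like_float/1]).

%% Hypothetical helper, NOT yamerl's actual code.
%% Returns true if Text (a list of Unicode code points) matches the
%% float pattern taken from the backtrace.
looks_like_float(Text) ->
    Pattern = "^(\\.[0-9]+|[0-9]+(\\.[0-9]*)?)([eE][-+]?[0-9]+)?$",
    %% The 'unicode' option makes re:run/3 accept code points > 255
    %% in the subject instead of raising badarg.
    re:run(Text, Pattern, [{capture, none}, unicode]) =:= match.
```

With this helper, a multibyte string is simply reported as not being a
float, and the caller can fall through to the next node type.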
PR: #5
Reported by: Eric Sagnes (ericsagnes on GitHub)
This is required by rebar(1). ebin/Makefile.am is updated to modify the
"{vsn, ...}" tuple while creating ebin/yamerl.app.
Submitted by: Gleb Peregud (earlier version)
PR: #2
If this option is enabled, an "str" node will be returned as a binary()
(encoded as UTF-8) instead of a string().
One may specify the encoding, using the following tuple:
{str_node_as_binary, Encoding}
where Encoding is a unicode:encoding().
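
As an illustration of what the option implies for a node's text, here is
a small sketch of the conversion, assuming the text is held as a list of
Unicode code points. The module and function names are hypothetical;
only unicode:characters_to_binary/3 is a real OTP call:

```erlang
-module(str_node_sketch).
-export([to_binary/1, to_binary/2]).

%% Hypothetical helpers, NOT yamerl's actual code.

%% Default case: encode the text as UTF-8.
to_binary(Text) ->
    to_binary(Text, utf8).

%% Explicit case: Encoding is any unicode:encoding() value,
%% e.g. latin1, utf8 or {utf16, little}.
to_binary(Text, Encoding) ->
    unicode:characters_to_binary(Text, unicode, Encoding).
```

Note that unicode:characters_to_binary/3 returns an error tuple rather
than a binary when a character cannot be represented in the requested
encoding (for example, Japanese text with latin1).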
Note that the implementation differs from yamler's: single-quoted flow
scalars are not automatically converted to Erlang atoms by yamerl,
because that would go against the YAML 1.2 specification. The
specification states that only nodes with the "?" non-specific tag
(i.e. plain flow scalars) should have their content inspected.
Single-quoted and double-quoted flow scalars, as well as block scalars,
have an implicit "!" non-specific tag and should be interpreted as
strings.
See "§3.3.2 Resolved Tags" in the YAML 1.2 specification.
PR: #846
PR: #847
Using HiPE brings about a 50% gain with the new benchmark test. With
this, the YAML parser is generally faster than the SAX parser from xmerl.
However, this is far from enough to compete with any JSON parser.
'acc': accumulate tokens in the parser's state. This was previously
available by not specifying a token function.
'drop': just drop tokens when they're ready.
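
The two modes can be pictured with the following sketch. It assumes a
parser state reduced to a map with "mode" and "tokens" keys; that
layout and the module name are hypothetical and do not reflect yamerl's
real state record:

```erlang
-module(token_sink_sketch).
-export([emit/2]).

%% Hypothetical dispatcher, NOT yamerl's actual code.
%% 'acc': keep the token in the state (most recent first).
emit(Token, #{mode := acc, tokens := Acc} = State) ->
    State#{tokens := [Token | Acc]};
%% 'drop': discard the token as soon as it is ready.
emit(_Token, #{mode := drop} = State) ->
    State;
%% Otherwise, hand the token to a user-supplied function.
emit(Token, #{mode := Fun} = State) when is_function(Fun, 1) ->
    Fun(Token),
    State.
```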
Now, "normal" characters are handled in do_parse_flow_scalar/6 and
escaped characters are handled in do_parse_flow_scalar_escaped/6. The
"next_escaped" record member is gone. The performance gain is around
12% with the source used during the benchmark.
Instead, the next state is called directly. The current state is saved
in #yaml_parser only when we need to suspend the parsing. The
performance gain is around 5%.
While here, we convert some functions to macros, to take advantage of
the optimization described in §3.5 of the Efficiency Guide.
This removes frequent updates of the buffer and line/column numbers in
this state. They are only synchronized before returning. The performance
gain is about 16% with the source used during the benchmark.
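
The idea can be sketched as follows, assuming a simplified state map
with "chars", "line" and "col" keys and a loop that only skips blanks.
All names here are hypothetical; the real parser state is a record and
its loops do more work:

```erlang
-module(sync_sketch).
-export([skip_blanks/1]).

%% Hypothetical example, NOT yamerl's actual code.
%% Unpack the state once, loop on plain arguments, and write the
%% buffer and position back into the state only on return.
skip_blanks(#{chars := Chars, line := Line, col := Col} = State) ->
    skip_blanks(Chars, Line, Col, State).

%% A newline moves to the start of the next line.
skip_blanks([$\n | Rest], Line, _Col, State) ->
    skip_blanks(Rest, Line + 1, 1, State);
%% A space advances the column.
skip_blanks([$\s | Rest], Line, Col, State) ->
    skip_blanks(Rest, Line, Col + 1, State);
%% Anything else ends the loop: synchronize the state exactly once.
skip_blanks(Rest, Line, Col, State) ->
    State#{chars := Rest, line := Line, col := Col}.
```

The state is rebuilt once per call instead of once per character, which
is the kind of saving the message above describes.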