core/oga - oga

Commit Graph

Author	SHA1	Message	Date
Yorick Peterse	94f8ed5421	Removed start/end comments of YARD blocks	2015-09-01 19:59:52 +02:00
Yorick Peterse	b6d34a406d	Removed redundant returns	2015-06-16 00:51:51 +02:00
Yorick Peterse	2766d5f27f	Pack HTML entities using "U*". See https://github.com/YorickPeterse/oga/issues/90#issuecomment-89859273 for more details, apparently I didn't fix this before.	2015-05-21 11:26:01 +02:00
Yorick Peterse	2ec91f130f	Lazy decoding of XML/HTML entities. Instead of decoding entities in the lexer we'll do this whenever XML::Text#text is called. This removes the overhead from the parsing phase and ensures the process is only triggered when actually needed. Note that calling #to_xml and/or the #inspect methods on a Text (or parent) instance will also trigger the entity conversion process. The new entity decoding API supports both regular entities (e.g. &amp;) as well as codepoint based entities (both regular and hexadecimal codepoints). To allow safe read-only access to Text instances from multiple threads a mutex is used. This mutex ensures that only 1 thread can trigger the conversion process. Fixes #68	2015-03-05 23:00:43 +01:00
Yorick Peterse	317b49bcf6	Implemented a basic SAX API. This API is a little bit dodgy (similar to Nokogiri's API) due to the use of separate parser and handler classes. This is done to ensure that the return values of callback methods (e.g. on_element) aren't used by Racc for building AST trees. This also ensures that whatever variables are set by the handler don't conflict with any variables of the parser. This fixes #42.	2014-09-16 14:30:46 +02:00
Yorick Peterse	d8e2b97031	Tweaked docs of the XML parsers.	2014-09-04 09:34:59 +02:00
Yorick Peterse	8237d5791d	Stream tokens when lexing. Instead of returning the tokens as a whole they are now streamed using XML::Lexer#advance. This method returns the next token upon every call. It uses a small buffer in case a particular block of text results in multiple tokens.	2014-04-09 22:08:13 +02:00
Yorick Peterse	79818eb349	Added a convenience class for parsing HTML. This removes the need for users having to set the `:html` option themselves.	2014-03-25 09:40:24 +01:00

8 Commits
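
The lazy entity decoding described in commit 2ec91f130f, together with the "U*" packing from 2766d5f27f, can be illustrated with a small standalone sketch. This is not Oga's actual code: the class name LazyText, the ENTITIES table, and the regular expression are hypothetical and chosen only for illustration. The sketch assumes decoding happens on the first call to #text, that the result is cached, and that a Mutex guards the conversion so only one thread runs it.

```ruby
# Hypothetical sketch; LazyText and ENTITIES are illustrative names,
# not part of Oga's API.
class LazyText
  # A tiny subset of named entities, enough to show the idea.
  ENTITIES = {
    'amp'  => '&',
    'lt'   => '<',
    'gt'   => '>',
    'quot' => '"',
    'apos' => "'"
  }.freeze

  # Matches hexadecimal codepoints (&#x41;), decimal codepoints (&#65;)
  # and named entities (&amp;), in that order.
  ENTITY_PATTERN = /&(?:#x([0-9a-fA-F]+)|#(\d+)|(\w+));/

  def initialize(raw)
    @raw     = raw
    @decoded = nil
    @mutex   = Mutex.new
  end

  # Decodes on the first call only and caches the result. The mutex keeps
  # concurrent readers from running the conversion at the same time.
  def text
    @mutex.synchronize do
      @decoded ||= decode(@raw)
    end
  end

  private

  def decode(input)
    input.gsub(ENTITY_PATTERN) do |match|
      hex, dec, name = $1, $2, $3

      if hex
        [hex.to_i(16)].pack('U*') # codepoint to UTF-8 string
      elsif dec
        [dec.to_i].pack('U*')
      else
        ENTITIES.fetch(name, match) # unknown names are left untouched
      end
    end
  end
end
```

With this sketch, constructing LazyText.new('a &amp; b') does no decoding work; the conversion runs the first time #text is called, and later calls return the cached string.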