{"id":242,"date":"2009-10-14T10:01:57","date_gmt":"2009-10-14T09:01:57","guid":{"rendered":"http:\/\/floris.briolas.nl\/floris\/?p=242"},"modified":"2010-01-25T11:09:01","modified_gmt":"2010-01-25T10:09:01","slug":"about-white-space-and-antlr","status":"publish","type":"post","link":"https:\/\/floris.briolas.nl\/floris\/2009\/10\/about-white-space-and-antlr\/","title":{"rendered":"About WhiteSpace and Antlr"},"content":{"rendered":"<p>Are you unsure if you need to include the space in your rules?<\/p>\n<p>Most grammars include this lexer rule:<\/p>\n<pre name=\"code\" class=\"antlr\">WS  :   ( ' '\r\n        | '\\t'\r\n        | '\\r'\r\n        | '\\n'\r\n        ) {$channel=HIDDEN;}\r\n    ;<\/pre>\n<p>Now suppose you would like a rule that needs to match &#8220;a sample string&#8221; (quotes included). How will Antlr respond to the space?<\/p>\n<p>or<\/p>\n<p>You want to parse a command-line-like parameter string such as &#8220;operation \/option1 \/option2&#8221; (quotes not included).<\/p>\n<pre  name=\"code\" class=\"antlr\">grammar test20091014;\r\n\r\n\/\/thanks to AntlrWorks 1.3 for its useful grammar wizard.\r\n\r\nprog\t:\tSTRING+ OPTION*;\r\n\r\nOPTION\t:\t'\/' STRING;\r\n\r\nSTRING\r\n    :  '\"' ( ESC_SEQ| ~('\\\\'|'\"') )* '\"'\r\n    |\tID\r\n    ;\r\n\r\nfragment ID  :\t('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*\r\n    ;\r\n\r\nWS  :   ( ' '\r\n        | '\\t'\r\n        | '\\r'\r\n        | '\\n'\r\n        ) {$channel=HIDDEN;}\r\n    ;\r\n\r\nfragment\r\nHEX_DIGIT : ('0'..'9'|'a'..'f'|'A'..'F') ;\r\n\r\nfragment\r\nESC_SEQ\r\n    :   '\\\\' ('b'|'t'|'n'|'f'|'r'|'\\\"'|'\\''|'\\\\')\r\n    |   UNICODE_ESC\r\n    |   OCTAL_ESC\r\n    ;\r\n\r\nfragment\r\nOCTAL_ESC\r\n    :   '\\\\' ('0'..'3') ('0'..'7') ('0'..'7')\r\n    |   '\\\\' ('0'..'7') ('0'..'7')\r\n    |   '\\\\' ('0'..'7')\r\n    ;\r\n\r\nfragment\r\nUNICODE_ESC\r\n    :   '\\\\' 'u' HEX_DIGIT HEX_DIGIT HEX_DIGIT HEX_DIGIT\r\n    ;<\/pre>\n<p>Parse 
Tree:<\/p>\n<div id=\"attachment_243\" style=\"width: 142px\" class=\"wp-caption alignnone\"><img aria-describedby=\"caption-attachment-243\" decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-243\" title=\"input: str \/opt1\" src=\"http:\/\/floris.briolas.nl\/floris\/wp-content\/uploads\/2009\/10\/exampl1.png\" alt=\"input: str \/opt1\" width=\"132\" height=\"142\" \/><p id=\"caption-attachment-243\" class=\"wp-caption-text\">input: str \/opt1<\/p><\/div>\n<div id=\"attachment_244\" style=\"width: 291px\" class=\"wp-caption alignnone\"><img aria-describedby=\"caption-attachment-244\" decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-244\" title=\"&quot;str with space&quot; \/&quot;some option&quot;\" src=\"http:\/\/floris.briolas.nl\/floris\/wp-content\/uploads\/2009\/10\/exampl2.png\" alt=\"&quot;str with space&quot; \/&quot;some option&quot;\" width=\"281\" height=\"144\" \/><p id=\"caption-attachment-244\" class=\"wp-caption-text\">&quot;str with space&quot; \/&quot;some option&quot;<\/p><\/div>\n<div id=\"attachment_246\" style=\"width: 458px\" class=\"wp-caption alignnone\"><img aria-describedby=\"caption-attachment-246\" decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-246\" title=\"&quot;str with space&quot; another space \/&quot;some option&quot;\" src=\"http:\/\/floris.briolas.nl\/floris\/wp-content\/uploads\/2009\/10\/exampl3.png\" alt=\"&quot;str with space&quot; another space \/&quot;some option&quot;\" width=\"448\" height=\"147\" srcset=\"https:\/\/floris.briolas.nl\/floris\/wp-content\/uploads\/2009\/10\/exampl3.png 448w, https:\/\/floris.briolas.nl\/floris\/wp-content\/uploads\/2009\/10\/exampl3-300x98.png 300w\" sizes=\"(max-width: 448px) 100vw, 448px\" \/><p id=\"caption-attachment-246\" class=\"wp-caption-text\">&quot;str with space&quot; another space \/&quot;some option&quot;<\/p><\/div>\n<p>Conclusion:<\/p>\n<p>No explicit need to include the WS in your lexer rule. 
&#8220;WS&#8221; is matched as its own token (on the hidden channel), so it splits input such as &#8220;a b c d&#8221; (quotes not included) into separate tokens.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Are you unsure if you need to include the space in your rules? Most grammars include this lexer rule: WS : ( &#8216; &#8216; | &#8216;\\t&#8217; | &#8216;\\r&#8217; | &#8216;\\n&#8217; ) {$channel=HIDDEN;} ; Now suppose you would like a rule that needs to match &#8220;a sample string&#8221; (quotes included). How will Antlr respond [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_newsletter_tier_id":0,"jetpack_publicize_message":"","jetpack_is_tweetstorm":false,"jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","enabled":false}}},"categories":[3],"tags":[6],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p61yPs-3U","_links":{"self":[{"href":"https:\/\/floris.briolas.nl\/floris\/wp-json\/wp\/v2\/posts\/242"}],"collection":[{"href":"https:\/\/floris.briolas.nl\/floris\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/floris.briolas.nl\/floris\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/floris.briolas.nl\/floris\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/floris.briolas.nl\/floris\/wp-json\/wp\/v2\/comments?post=242"}],"version-history":[{"count":10,"href":"https:\/\/floris.briolas.nl\/floris\/wp-json\/wp\/v2\/posts\/242\/revisions"}],"predecessor-version":[{"id":274,"href":"https:\/\/floris.briolas.nl\/floris\/wp-json\/wp\/v2\/posts\/242\/revisions\/274"}],"wp:attachment":[{"href":"https:\/\/floris.briolas.nl\/floris\/wp-json\
/wp\/v2\/media?parent=242"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/floris.briolas.nl\/floris\/wp-json\/wp\/v2\/categories?post=242"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/floris.briolas.nl\/floris\/wp-json\/wp\/v2\/tags?post=242"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}