module Ast = Ast

(* odoc uses an ocamllex lexer. The "engine" for such lexers is the standard
   [Lexing] module.

   As the [Lexing] module reads the input, it keeps track of only the byte
   offset into the input. It is normally the job of each particular lexer
   implementation to decide which character sequences count as newlines, and
   keep track of line/column locations. This is usually done by writing
   several extra regular expressions, and calling [Lexing.new_line] at the
   right time. Keeping track of newlines like this makes the odoc lexer
   somewhat too difficult to read, however. To factor the aspect of keeping
   track of newlines fully out of the odoc lexer, instead of having it keep
   track of newlines as it's scanning the input, the input is pre-scanned
   before feeding it into the lexer. A table of all the newlines is assembled,
   and used to convert offsets into line/column pairs after the lexer emits
   tokens.

   [offset_to_location ~input ~comment_location offset] converts the byte
   [offset], relative to the beginning of a comment, into a location, relative
   to the beginning of the file containing the comment. [input] is the comment
   text, and [comment_location] is the location of the comment within its
   file.

   The function is meant to be partially applied to its first two arguments,
   at which point it creates the table described above. The remaining
   function is
   then passed to the lexer, so it can apply the table to its emitted
   tokens. *)
let offset_to_location :
    input:string -> comment_location:Lexing.position ->
      (int -> Odoc_model.Location_.point) =
    fun ~input ~comment_location ->

  let rec find_newlines line_number input_index newlines_accumulator =
    if input_index >= String.length input then
      newlines_accumulator
    else
      (* This is good enough to detect CR-LF also. *)
      if input.[input_index] = '\n' then
        find_newlines
          (line_number + 1)
          (input_index + 1)
          ((line_number + 1, input_index + 1)::newlines_accumulator)
      else
        find_newlines line_number (input_index + 1) newlines_accumulator
  in

  let reversed_newlines : (int * int) list =
    find_newlines 1 0 [(1, 0)] in

  fun byte_offset ->
    let rec scan_to_last_newline reversed_newlines_prefix =
      match reversed_newlines_prefix with
      | [] ->
        assert false
      | (line_in_comment, line_start_offset)::prefix ->
        if line_start_offset > byte_offset then
          scan_to_last_newline prefix
        else
          let column_in_comment = byte_offset - line_start_offset in
          let line_in_file =
            line_in_comment + comment_location.Lexing.pos_lnum - 1 in
          let column_in_file =
            if line_in_comment = 1 then
              column_in_comment
                + comment_location.Lexing.pos_cnum
                - comment_location.Lexing.pos_bol
            else
              column_in_comment
          in
          {Odoc_model.Location_.line = line_in_file; column = column_in_file}
    in
    scan_to_last_newline reversed_newlines

let make_parser ~location ~text parse =
  Odoc_model.Error.accumulate_warnings begin fun warnings ->
    let token_stream =
      let lexbuf = Lexing.from_string text in
      let offset_to_location =
        offset_to_location ~input:text ~comment_location:location in
      let input : Lexer.input =
        {
          file = location.Lexing.pos_fname;
          offset_to_location;
          warnings;
          lexbuf;
        }
      in
      Stream.from (fun _token_index -> Some (Lexer.token input lexbuf))
    in
    parse warnings token_stream
  end

let parse_comment_raw ~location ~text =
  make_parser ~location ~text Syntax.parse

let parse_comment ~sections_allowed ~containing_definition ~location ~text =
  make_parser ~location ~text (fun warnings token_stream ->
    Syntax.parse warnings token_stream
    |> Semantics.ast_to_comment
         warnings ~sections_allowed ~parent_of_sections:containing_definition)
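The pre-scanned newline table described in the header comment can be illustrated in isolation. The sketch below uses illustrative names (`newline_table`, `point_of_offset`) that are not part of odoc's API; it builds the same reversed list of (line number, line start offset) pairs and performs the same backwards scan, minus the comment-location adjustment:

```ocaml
(* Minimal sketch of the newline-table technique. Illustrative only;
   the real implementation lives in [offset_to_location] above. *)

(* One pass over the input builds a reversed list of
   (line_number, line_start_offset) pairs, one per line. *)
let newline_table input =
  let rec scan line index acc =
    if index >= String.length input then acc
    else if input.[index] = '\n' then
      scan (line + 1) (index + 1) ((line + 1, index + 1) :: acc)
    else
      scan line (index + 1) acc
  in
  scan 1 0 [(1, 0)]

(* Because the table is reversed, the first entry whose start offset is
   <= the target offset is the line containing that offset; the column
   is the distance from the line's start. *)
let point_of_offset table offset =
  let rec find = function
    | [] -> assert false
    | (line, start) :: rest ->
      if start > offset then find rest else (line, offset - start)
  in
  find table

let () =
  let table = newline_table "ab\ncd\nef" in
  assert (point_of_offset table 0 = (1, 0));
  assert (point_of_offset table 4 = (2, 1));
  assert (point_of_offset table 7 = (3, 1))
```

Building the table once and reusing it for every token is what motivates the partial application of `offset_to_location`: the table construction runs on the first two arguments, and the returned closure only does the cheap lookup.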
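The `Stream.from` call in `make_parser` turns the lexer into an on-demand token stream: each `Stream.next` invokes the lexer once. A minimal sketch of that pattern, with a stand-in tokenizer in place of `Lexer.token` (note that the `Stream` module was deprecated in OCaml 4.14 and removed in 5.0):

```ocaml
(* Sketch of driving a tokenizer through Stream.from, as make_parser
   does with Lexer.token. [tokens_of_string] is an illustrative name;
   here each "token" is just one character of the input. *)
let tokens_of_string text =
  let pos = ref 0 in
  Stream.from (fun _token_index ->
    if !pos >= String.length text then
      None  (* None ends the stream *)
    else begin
      let c = text.[!pos] in
      incr pos;
      Some c
    end)

let () =
  let s = tokens_of_string "ab" in
  assert (Stream.next s = 'a');
  assert (Stream.next s = 'b')
```

The real code always returns `Some (Lexer.token input lexbuf)` because the odoc lexer signals end of input with its own end-of-comment token rather than by terminating the stream.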