What happens during evaluation in R?


I'm interested in how a basic thing, evaluation, works in R.

I came to R as a biologist and, although I am interested in the underlying code, it's still a bit mysterious.

I think I understand properly:

  • that everything that exists is an object in R
  • that everything that happens is a function call (same ref)
  • what an environment is, and how lazy evaluation works
  • (more or less) what happens during compilation in a compiled language.

But technically, what happens behind the curtain when I evaluate something in R, when I press Enter after one (or more) line(s) of code?

I have found this, in the R Language Definition from the R Core Team:

When a user types a command at the prompt (or when an expression is read from a file) the first thing that happens to it is that the command is transformed by the parser into an internal representation. The evaluator executes parsed R expressions and returns the value of the expression. All expressions have a value. This is the core of the language.

But this is abstruse to me (particularly the part in bold type), and the subsections do not help me disentangle it.

Do I have to open a fundamental book on informatics to understand this, or is there another way to understand, technically, what I'm doing 8 hours a day?

This is going to be an incomplete answer, but it seems your question is about the nature of the "internal representation." In essence, R's parser takes arbitrary R code, removes irrelevant stuff (like superfluous whitespace), and creates a nested set of expressions to evaluate. We can use pryr::call_tree() to see what is going on.

Take a simple expression that uses only mathematical operators:

    > 1 + 2 - 3 * 4 / 5
    [1] 0.6

In this series of operations, the output respects R's precedence rules. But what is actually happening? First, the parser converts whatever is typed into an "expression":

    > parse(text = "1 + 2 - 3 * 4 / 5")
    expression(1 + 2 - 3 * 4 / 5)

This expression masks a deeper complexity:

    > library("pryr")
    > call_tree(parse(text = "1 + 2 - 3 * 4 / 5"))
    \- ()
      \- `-
      \- ()
        \- `+
        \-  1
        \-  2
      \- ()
        \- `/
        \- ()
          \- `*
          \-  3
          \-  4
        \-  5
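If pryr is not installed, the same nesting can be explored with base R alone. The following is only a minimal sketch: it uses quote() to capture the call without evaluating it, then indexes into the call like a list. The variable name e is just an illustration.

```r
# Base R only; no pryr needed.
# quote() captures the code as a call object instead of evaluating it.
e <- quote(1 + 2 - 3 * 4 / 5)

is.call(e)        # TRUE: the whole expression is a single function call
e[[1]]            # the function at the top of the tree: the symbol `-`
e[[2]]            # first argument, itself a call (1 + 2)
e[[3]]            # second argument, a call to `/`
e[[3]][[2]]       # first argument of `/`, a call to `*`

# Each element of a call is a symbol, a constant, or another call.
top_function <- as.character(e[[1]])
```

Element 1 of a call is always the function being called, and the remaining elements are its arguments, which is exactly what the tree above displays.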

This expression is really a nest of four function calls: "*"() sits innermost, inside "/"(), and the outermost call, "-"(), combines that result with the result of "+"(). Thus, it can be rewritten as a nested expression:

    > "-"("+"(1,2), "/"("*"(3,4), 5))
    [1] 0.6
    > call_tree(parse(text = '"-"("+"(1,2), "/"("*"(3,4), 5))'))
    \- ()
      \- `-
      \- ()
        \- `+
        \-  1
        \-  2
      \- ()
        \- `/
        \- ()
          \- `*
          \-  3
          \-  4
        \-  5
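One way to convince yourself that the infix and prefix forms really are the same thing internally is to compare the two parsed calls directly. This is a small sketch in base R; the names infix and prefix are only for illustration, and the prefix form is written with backticks rather than quoted strings.

```r
# Capture both spellings without evaluating them.
infix  <- quote(1 + 2 - 3 * 4 / 5)
prefix <- quote(`-`(`+`(1, 2), `/`(`*`(3, 4), 5)))

# The parser produces the same call object for both spellings.
same_tree <- identical(infix, prefix)

# Evaluating either call gives the same number.
same_value <- isTRUE(all.equal(eval(infix), eval(prefix)))
```

Both same_tree and same_value should come out TRUE, which is the programmatic version of the two identical call trees shown above.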

Multi-line expressions are parsed into individual expressions:

    > parse(text = "1; 2; 3")
    expression(1, 2, 3)
    > parse(text = "1\n2\n3")
    expression(1, 2, 3)
    > call_tree(parse(text = "1; 2; 3"))
    \-  1
    \-  2
    \-  3

These call trees are then evaluated.
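To make the "then evaluated" step concrete, here is a rough sketch that parses three lines of code and evaluates them one at a time with eval(). The scratch environment env and the variable names x and y are assumptions made for the example, not anything R requires.

```r
# Parse three lines into an expression vector of length 3.
exprs <- parse(text = "x <- 1 + 2\ny <- x * 10\ny - 3 * 4 / 5")
n_exprs <- length(exprs)

# Evaluate each parsed expression in turn, in a scratch environment,
# so that assignments made by one line are visible to the next.
env <- new.env()
result <- NULL
for (i in seq_along(exprs)) {
  result <- eval(exprs[[i]], envir = env)
}

# After the loop, env holds x and y, and result is the last value.
ls(env)
```

Because every expression has a value, even the assignments return something; the interpreter simply chooses not to print it.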

Thus, when R's read-eval-print loop executes, it parses the code typed in the interpreter or sourced from a file into this call tree structure, then evaluates each function call in turn, and then prints the result (unless an error occurs). Errors occur when a parsable line of code cannot be evaluated:

    > call_tree(parse(text = "2 + 'a'"))
    \- ()
      \- `+
      \-  2
      \-  "a"
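The tree above parses without complaint; the failure only shows up when the call is evaluated. A minimal sketch of that distinction, using tryCatch() to capture the error message instead of stopping:

```r
# Parsing succeeds: "2 + 'a'" is syntactically valid R.
parsed <- parse(text = "2 + 'a'")
parsed_ok <- is.expression(parsed) && length(parsed) == 1

# Evaluation fails: `+` cannot add a number and a character string.
msg <- tryCatch(
  eval(parsed[[1]]),
  error = function(e) conditionMessage(e)
)
msg
```

Here msg holds the text of the evaluation error rather than a numeric result.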

And a parsing failure occurs when a typable line of code cannot be parsed into a call tree:

    > parse(text = "2 + +")
    Error in parse(text = "2 + +") : <text>:2:0: unexpected end of input
    1: 2 + +
       ^
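Putting the two failure modes side by side, the parse-then-evaluate cycle can be imitated with a few lines of base R. This is only a toy: mini_repl() is a hypothetical helper written for this answer, not how R's real read-eval-print loop is implemented, and it returns the last value instead of printing each one.

```r
# Hypothetical helper: parse a string, then evaluate each expression,
# reporting which stage (parse or eval) failed, if any.
mini_repl <- function(code, env = new.env()) {
  # Stage 1: parse. Syntax problems surface here, before anything runs.
  exprs <- tryCatch(parse(text = code), error = function(e) e)
  if (inherits(exprs, "error")) {
    return(list(stage = "parse", ok = FALSE, message = conditionMessage(exprs)))
  }
  # Stage 2: evaluate the parsed expressions one after another.
  value <- NULL
  for (i in seq_along(exprs)) {
    value <- tryCatch(eval(exprs[[i]], envir = env), error = function(e) e)
    if (inherits(value, "error")) {
      return(list(stage = "eval", ok = FALSE, message = conditionMessage(value)))
    }
  }
  list(stage = "eval", ok = TRUE, value = value)
}

good       <- mini_repl("1 + 2 - 3 * 4 / 5")  # parses and evaluates
eval_fail  <- mini_repl("2 + 'a'")            # parses, fails at evaluation
parse_fail <- mini_repl("2 + +")              # never gets past the parser
```

Looking at the stage element of each result shows where things went wrong: "parse" for the incomplete input, "eval" for the type mismatch.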

That's not the complete story, but perhaps it gets you part of the way to understanding.

