There are several ways of specifying arguments to functions in R. In lsfit(urb, infmor), arguments are matched by position: the first argument, x, holds the independent variable(s) (a single vector or a matrix), and the second argument, y, holds the dependent variable.
Other, more modern modelling functions have a formula interface. With the lm() function, for instance, you write lm(infmor ~ urb + gnpserv) to produce a regression with urb and gnpserv as independents and infmor as the dependent variable.
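To make the difference between the two calling styles concrete, here is a minimal sketch. The data frame `dat` and all its values are invented for illustration; only the variable names come from the text above.

```r
# Hypothetical toy data: names follow the text, values are made up.
dat <- data.frame(
  urb     = c(20, 35, 50, 65, 80, 90),
  gnpserv = c(30, 42, 38, 55, 60, 70),
  infmor  = c(120, 95, 80, 50, 30, 15)
)

# Positional interface: x (independent) first, y (dependent) second.
fit_ls <- lsfit(dat$urb, dat$infmor)

# Formula interface: dependent ~ independents, looked up in `data`.
fit_lm <- lm(infmor ~ urb, data = dat)

# Both are least-squares fits of the same line, so the
# intercept and slope should agree (only the labels differ).
fit_ls$coefficients
coef(fit_lm)
```

The formula version names the variables once and leaves the lookup to the `data` argument, which tends to be easier to read once more than one independent is involved.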
Here are some examples of formulae you can use. The formula interface lets you write the equations quite naturally, including interaction terms, as in the third example.
Formula | Explanation |
---|---|
infmor ~ urb | Bivariate regression with infmor as the dependent variable |
infmor ~ urb + gnpserv | Two independents |
infmor ~ urb + gnpserv + urb*gnpserv | Same with an interaction term (urb*gnpserv on its own expands to the same model) |
infmor ~ urb + continent | continent being a factor variable, R will generate dummy variables for all categories except the first level, which will be the reference category |
infmor ~ log(urb) | You can use functions such as log directly in a formula |
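One way to see what R builds from such a formula is to inspect the design matrix with model.matrix(). The sketch below uses invented data (the continent labels and all values are assumptions for illustration) and relies on R's default treatment contrasts.

```r
# Hypothetical toy data: values and continent labels are made up.
dat <- data.frame(
  infmor    = c(120, 95, 80, 50, 30, 15),
  urb       = c(20, 35, 50, 65, 80, 90),
  continent = factor(c("Africa", "Africa", "Asia",
                       "Asia", "Europe", "Europe"))
)

# model.matrix() returns the columns the model is actually fitted on.
X <- model.matrix(infmor ~ log(urb) + continent, data = dat)

# The first factor level ("Africa") gets no column of its own:
# it is the reference category absorbed into the intercept.
colnames(X)
```

With these data the columns are the intercept, log(urb), and one dummy each for Asia and Europe.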
The following symbols are used in a formula:
Symbol | Example | Explanation |
---|---|---|
~ | y ~ x | Separates the dependent variable from the independents |
+ | + x | Add variable x |
- | - x | Remove variable x |
* | x*a | Include these variables and the interaction between them |
: | x:a | Interaction between these variables only |
^ | (a+b+c)^3 | Include these variables and all interactions up to 3-way |
poly | poly(x,3) | Include polynomial terms of x up to degree 3 |
1 | - 1 | Delete the intercept |
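The expansion rules in the table can be checked without fitting anything, by asking terms() which terms a formula stands for. In the sketch below, `expand_terms` is a hypothetical helper name introduced only for this illustration; y, x, a, b and c are placeholder variable names.

```r
# Hypothetical helper: list the terms a formula expands to.
expand_terms <- function(f) attr(terms(f), "term.labels")

expand_terms(y ~ x * a)          # "x" "a" "x:a"
expand_terms(y ~ x:a)            # "x:a"
expand_terms(y ~ x + a - a)      # "x"
expand_terms(y ~ (a + b + c)^3)  # "a" "b" "c" "a:b" "a:c" "b:c" "a:b:c"

# The intercept is tracked separately: 1 if present, 0 if removed.
attr(terms(y ~ x), "intercept")      # 1
attr(terms(y ~ x - 1), "intercept")  # 0
```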
Note that the - sign to remove terms is useful for updating a formula (see this document for an example).
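As a small sketch of that use, update() can refit a model with one term dropped. The data are again invented for illustration; in the second argument, each dot stands for the corresponding side of the existing formula.

```r
# Hypothetical toy data: names follow the text, values are made up.
dat <- data.frame(
  urb     = c(20, 35, 50, 65, 80, 90),
  gnpserv = c(30, 42, 38, 55, 60, 70),
  infmor  = c(120, 95, 80, 50, 30, 15)
)

fit_full <- lm(infmor ~ urb + gnpserv, data = dat)

# ". ~ . - gnpserv" keeps the old left and right-hand sides
# and removes gnpserv from the right-hand side.
fit_small <- update(fit_full, . ~ . - gnpserv)

formula(fit_small)  # infmor ~ urb
```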
See the documentation for further details and more complex model specification.