Ah, with a bit more research I've answered my own question, based on this answer: Regression on subset of data set. Must look harder next time. I used ddply instead of lmList (makes me wonder why anyone uses lmList... maybe I should ask another question?): res1 <- ddply(sub, c("ERF", "Wafer"), function(x)...

Got the clue from this post: Why do R and statsmodels give slightly different ANOVA results? When the data is read from the CSV file, the Diet column becomes an ordinary numeric column, but for ANOVA it has to be a factor variable (I am still not clear why...

With the default R formula syntax the * not only includes the interaction terms but also the individual terms. If you just want the interaction term, then you use the : operator. So in your case, you want fe1 <- summary(lm(qnorm(y) ~ factor(Bank) -1 + factor(Country):x, data=PDwideHPI)) ...
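A minimal sketch of the difference (toy data standing in for the PDwideHPI model): * expands to main effects plus the interaction, while : produces only the interaction columns.

```r
# '*' expands to main effects plus interaction; ':' gives the interaction only.
d <- data.frame(y = c(1, 3, 2, 5, 4, 6),
                g = factor(c("a", "a", "b", "b", "a", "b")),
                x = c(0.1, 0.4, 0.2, 0.9, 0.5, 0.7))

colnames(model.matrix(y ~ g * x, d))  # intercept, gb, x, gb:x
colnames(model.matrix(y ~ g : x, d))  # intercept, ga:x, gb:x
```

Note that with : and no main effect for the factor, R gives one slope per factor level.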


One way of handling this would be to calculate a new column with your total offset and remove the columns used in your offset from the data set: # create copy of data without columns used in offset dat <- df[-match(inputs_fix, names(df))] # calculate offset dat$offset <- 0 for (i...

You are encountering how ordinal factor variables are handled by regression functions: the default set of contrasts is orthogonal polynomial contrasts up to degree n-1, where n is the number of levels for that factor. It's not going to be very easy to interpret that result ... especially if...
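A small sketch of this behaviour (hypothetical data, not the asker's): an ordered factor gets .L/.Q polynomial coefficients, and you can ask for treatment contrasts instead if those are easier to interpret.

```r
# Ordered factor -> polynomial contrasts: coefficients named f.L, f.Q
f <- factor(c("low", "mid", "high"), levels = c("low", "mid", "high"),
            ordered = TRUE)
d <- data.frame(y = c(1, 2, 4, 1.5, 2.5, 3.5), f = rep(f, 2))

names(coef(lm(y ~ f, d)))

# Easier to interpret: request treatment (dummy) contrasts instead
names(coef(lm(y ~ f, d, contrasts = list(f = "contr.treatment"))))
```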

This is due to the fact that 2011^3 is a very big number, and this is causing the coefficient to be returned as NA. If you had inspected the models, you would have noticed this. coef(lm(attend ~ year + I(year^2) + I(year^3),ds)) # (Intercept) year I(year^2) I(year^3) # -7.025524e+04...
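One common fix, sketched on synthetic data (not the original ds data set): use poly() so the polynomial terms are orthogonal and well-conditioned, and no coefficient is dropped as NA.

```r
# Raw powers of a year around 2000 are nearly collinear at double precision;
# orthogonal polynomials via poly() avoid the problem.
set.seed(1)
year   <- 2000:2015
attend <- 100 + 2 * (year - 2007) + rnorm(length(year))

fit <- lm(attend ~ poly(year, 3))
coef(fit)  # all four coefficients are estimable, none is NA
```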

It isn't the nicest solution, but it does what you want. library(MuMIn) options(na.action = na.fail) fm1 <- lm(y ~ X1 + X2, Cement) m1 <- dredge(fm1) ms1 <- subset(m1, delta < 32) fm2 <- lm(y ~ X3 + X4, Cement) m2 <- dredge(fm2) ms2 <- subset(m2, delta < 20) a1 <-...

Note that lm() returns an object of class "lm" and summary() on that object produces a "summary.lm" object. There are custom print.lm() and print.summary.lm() methods, so whatever is printed to the console may differ from what's in the object itself. When you manually concatenate (c()) two summary.lm objects,...
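A quick sketch of the point above, using mtcars: c() on two summary.lm objects drops the class, so the custom print method no longer applies.

```r
s1 <- summary(lm(mpg ~ wt, data = mtcars))
s2 <- summary(lm(mpg ~ hp, data = mtcars))

class(s1)         # "summary.lm": printed via print.summary.lm
class(c(s1, s2))  # "list": the concatenated object has lost its class
```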

The problem is that once the NA values are omitted from the data set, there aren't any "nicht erlebt" observations left: summary(na.omit(minimal)) swls exp.factor Min. :1.200 erlebt :64 1st Qu.:4.400 nicht erlebt: 0 Median :5.500 Mean :5.119 3rd Qu.:6.200 Max. :7.000 So lm is going to have trouble fitting a...

Here's a vote for the plyr package and ddply(). plyrFunc <- function(x){ mod <- lm(b~c, data = x) return(summary(mod)$coefficients[2,3]) } tStats <- ddply(dF, .(a), plyrFunc) tStats a V1 1 a 1.6124515 2 b -0.1369306 3 c 0.6852483 ...

This is a problem of using different names between your data and your newdata and not a problem between using vectors or dataframes. When you fit a model with the lm function and then use predict to make predictions, predict tries to find the same names on your newdata. In...
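A minimal sketch of that name-matching rule (toy data, not the asker's):

```r
d <- data.frame(x = 1:10, y = 2 * (1:10) + 1)
fit <- lm(y ~ x, data = d)

# The column name in newdata must match the covariate name used in the fit
predict(fit, newdata = data.frame(x = 5))
# A newdata column named anything else (e.g. z) would not be used for x
```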

No, you are not correct. You would be correct if you had done this: lm( yvar ~ xvar + as.numeric(xfac) +I(as.numeric(xfac)^2), data=dat) But that's not the same as what R does when it encounters such a situation. Whether or not the quadratic term will "weaken" the linear estimate really depends...

Try: lines(sort(hp), fitted(fit)[order(hp)], col='red', type='b') Because the observations in your dataset are not ordered by hp, when you use lines on the unsorted values the result is a mess....


There are ways to transform your response variable, G in this case, but there needs to be a good reason to do this. For example, if you want the output to be probabilities between 0 and 1 and your response variable is binary (0,1), then you need a logistic regression....
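A hedged sketch of that case with synthetic data (not the asker's G variable): glm with a binomial family keeps all fitted values between 0 and 1.

```r
set.seed(42)
x <- rnorm(100)
p <- 1 / (1 + exp(-x))        # true probabilities from a logistic curve
yb <- rbinom(100, 1, p)        # binary 0/1 response

logit_fit <- glm(yb ~ x, family = binomial)
range(fitted(logit_fit))       # fitted probabilities lie strictly in (0, 1)
```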


There is a convenience function in the QuantPsyc package for that, called lm.beta. However, I think the easiest way is to just standardize your variables. The coefficients will then automatically be the standardized "beta"-coefficients (i.e. coefficients in terms of standard deviations). For instance, lm(scale(your.y) ~ scale(your.x), data=your.Data) will give you...
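A small check of the scale() approach (toy data in place of your.Data): with a single predictor, the standardized coefficient equals the Pearson correlation.

```r
set.seed(7)
d <- data.frame(x = rnorm(50))
d$y <- 0.6 * d$x + rnorm(50)

b_std <- coef(lm(scale(y) ~ scale(x), data = d))[2]
all.equal(unname(b_std), cor(d$x, d$y))  # TRUE: beta equals r here
```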

Try this: update(lm1,as.formula(paste0(".~.+I(latitude^",i,")"))) Your code doesn't work because it is a formula and R takes it as-is; it does not resolve the variable i to the values 1, 2, ..., 10. So you need to paste i into the rest of the formula first and then tell R it is a formula. ...
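The same paste0/as.formula pattern sketched in a loop, with mtcars standing in for the latitude model:

```r
# Start from a simple fit and add successive powers of wt via update()
fit <- lm(mpg ~ wt, data = mtcars)
for (i in 2:4) {
  fit <- update(fit, as.formula(paste0(". ~ . + I(wt^", i, ")")))
}
length(coef(fit))  # intercept + wt + wt^2 + wt^3 + wt^4 = 5 coefficients
```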

The R-squared, adjusted R-squared, and all other values you see in the summary are accessible from within the summary object. You can see everything by using str(summary(M.lm)): > str(summary(M.lm)) # Truncated output... List of 11 $ call : language lm(formula = MaxSalary ~ Score, data = salarygov) $ terms :Classes...

The newdata parameter should be a data.frame with column names matching the names used as covariates. So the correct case is predict(lm(x~y),newdata=data.frame(y=2.5)) or predict(lm(y~x),newdata=data.frame(x=2.5)) depending on which way you wanted to do the regression....

If you only need the coefficients, you can try this: library(data.table) setDT(df) dafr <- df[, as.list(lm.fit(cbind(1, b), a)$coef), by=list(c, d)] setnames(dafr, c("c", "d", "intercept", "slope")) # c d intercept slope #1: 1 5 1.869449e-13 0.5 #2: 2 6 5.176935e-13 0.5 #3: 3 7 5.000000e+02 0.5 #4: 4 8 5.000000e+02 0.5...

You got most of the code right. It would be better to use the time (Tiempo) variable as an id variable in your melt call. This will ensure the lengths of the data match up. library(reshape2) # This is a faster version of reshape df.m <- melt(df.matias, id.var="Tiempo") #I stored your data...

model.sel result is a data.frame, so you can modify it (add model names, round numbers etc) and export to latex using e.g. latex from Hmisc package. # include R^2: R2 <- function(x) summary(x)$r.squared ms <- model.sel(m1, m2, m3, m4, extra = "R2") i <- 1:4 # indices of columns with...

It is better to post a reproducible example, otherwise we are guessing! Look at this example and see if it solves your problem. n=50 set.seed(123) d=data.frame(o=rnorm(n,10,3),t=1:n,w=rep(c("A","B","C"),length.out=n)) m=10 td=data.frame(o=rnorm(m,10,3),t=(n+1):(m+n),w=c("D","E",rep(c("A","B","C"),length.out=m-2))) model <- lm(o ~ t * w,data=d) cbind(td$o,predict(model,newdata=td[,-1])) # Error here newlevels=levels(td$w)[!levels(td$w)%in%levels(d$w)] ntd=td ntd$w=factor(ifelse(td$w%in%newlevels,NA,td$w),labels=levels(d$w))...

It seems you want the update command. It allows you to add/drop terms from a formula easily. head(mtcars) # using mtcars for the example # Create your common controls in a formula common <- ~ cyl + hp + drat # Add the response and an additional predictor new_form <-...
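A sketch of one way to finish that idea (the new_form line is my guess at the truncated step, using mtcars): update() can take the one-sided control formula and splice in a response plus an extra predictor.

```r
common <- ~ cyl + hp + drat                 # common controls, one-sided
new_form <- update(common, mpg ~ . + wt)    # add response and a predictor
fit <- lm(new_form, data = mtcars)
names(coef(fit))  # intercept plus cyl, hp, drat, wt
```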

simplify in terms.formula does the opposite to what you think it does. You actually want simplify = FALSE, but there's no way to do that using the default stats::update.formula. Here's a version that does what you want. Note that the default method has just been changed to use my version...

You get the error Error: object of type 'closure' is not subsettable because ddply tries to resolve t in the local environment before the global environment. Indeed, it finds the transpose function t (a closure) and not your global variable t. You just need to change the name to something other...

Use the subset parameter of the lm function: lm(Y ~ P, data=df, subset=df$P %in% 1:3)...
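A runnable sketch of the subset argument, with mtcars standing in for df/Y/P (note that subset is evaluated inside data, so the df$ prefix isn't needed):

```r
fit <- lm(mpg ~ wt, data = mtcars, subset = cyl %in% c(4, 6))
nrow(model.frame(fit))  # only the 4- and 6-cylinder cars were used
```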

You're looking for the lsmeans package. Check it out: lstrends(mod, specs = c('cat1', 'cat2', 'cat3'), var = 'cont1') cat1 cat2 cat3 cont1.trend SE df lower.CL upper.CL a c e 0.01199024 0.08441129 984 -0.15365660 0.1776371 b c e 0.01083637 0.08374605 984 -0.15350502 0.1751778 a d e 0.03534914 0.09077290 984 -0.14278157 0.2134799...

As @MrFlick suggested, using summary.lm can help. Code: # Build Sample Data df <- data.frame(y = rnorm(100), x1 = rnorm(100), x2 = rnorm(100), x3 = rnorm(100), z1 = rnorm(100), z2 = rnorm(100), z3 = rnorm(100)) # Run Model sum <- summary.lm(lm(y ~ x1 + x2 + x3 + z1 +...

You can do this with a simple lapply, like this: lapply(l, lm, formula = a ~ c) ...
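The same call on a toy list (standing in for l), to show the fitted models come back as a named list:

```r
l <- list(d1 = data.frame(a = 1:5, c = c(2, 4, 6, 8, 10)),
          d2 = data.frame(a = 5:1, c = 1:5))

fits <- lapply(l, lm, formula = a ~ c)   # one lm fit per data frame
sapply(fits, function(m) coef(m)[["c"]]) # per-data-frame slopes
```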

When you do a prediction the names of all the columns used as predictors in the model must be the same as the column in the new data. Using your sample data.frames above, this should work #change name to match the model data names(df.newData)<-"ts.in" #this should be true # >...

w needs to be 3x3, so use diag to construct w as a matrix with those values on the diagonal instead of using a vector x <- matrix(c(1,2,3,4,5,6),nrow=3,ncol=2,byrow=T) xt <- t(x) w <- diag(c(7,8,9)) xt %*% w %*% x ...
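Running that code confirms the shapes: diag() makes the weight vector a 3x3 matrix so x' W x is conformable, giving a 2x2 result.

```r
x  <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3, ncol = 2, byrow = TRUE)
w  <- diag(c(7, 8, 9))   # 3x3 diagonal weight matrix
xt <- t(x)

res <- xt %*% w %*% x    # 2x2 weighted cross-product
res
#      [,1] [,2]
# [1,]  304  380
# [2,]  380  480
```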

This isn't pure "dplyr", but rather, "dplyr" + "tidyr" + "data.table". Still, I think it should be pretty easily readable. library(data.table) library(dplyr) library(tidyr) mtcars %>% gather(var, val, cyl:carb) %>% as.data.table %>% .[, as.list(summary(lm(mpg ~ val))$coefficients[2, 1:2]), by = var] # var Estimate Std. Error # 1: cyl -2.87579014 0.322408883 #...

This is not directly saved as a TRUE/FALSE flag in the model object. A way to make this work would be grepl("log", names(m1$model)[[1]]) grepl("log", names(m2$model)[[1]]) which will search for the word "log" in the model part of the lm-object....
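A sketch of that check with mtcars in place of m1/m2: the first column name of the model frame carries the transformation, so grepl can find it.

```r
m_log <- lm(log(mpg) ~ wt, data = mtcars)
m_raw <- lm(mpg ~ wt, data = mtcars)

grepl("log", names(m_log$model)[[1]])  # TRUE:  response is "log(mpg)"
grepl("log", names(m_raw$model)[[1]])  # FALSE: response is "mpg"
```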

Building on the comments, gear isn't defined globally. It works inside the stand-alone lm call because you specify the data you are using, so lm knows to take gear from df. However, gear itself doesn't exist outside that stand-alone lm call. This is shown by the output of gear >...

The problem doesn't seem to be with the subset. I get the same error from your function when I change to subset = (state == 1). The arguments to your function aren't being passed and evaluated correctly. I think you'd be better off using do.call myfunction <- function(formula, data, subset)...

try using mean instead of sum like this ggplot(data = df, aes(x = Month, y = Count.V)) + stat_summary(fun.y = mean, geom ="line")+ stat_smooth(method = "lm", formula = y ~ poly(x, 3), size = 1) + geom_point()+ scale_x_date(labels = date_format("%m-%y"), breaks = "3 months") ...

A solution would be to use lines() and have two predictions for both extremes of x. See this example: x <- rnorm(20) y <- 5 + 0.4*x + rnorm(20)/10 dt <- data.frame(x=x, y=y) ols1 <- lm(y ~ x, data=dt) nd <- data.frame(x=range(x)) ## generate new data with the two extremes...


With your fit list, you can extract the coefficients and r-squared values with fit<-apply(aa,2,function(x) lm(x~aa$Tiempo)) mysummary <- t(sapply(fit, function(x) { ss<-summary(x); c(coef(x), r.square=ss$r.squared, adj.r.squared=ss$adj.r.squared) })) We use sapply to go over the list you created and extract the coefficients from the model and the r-squared values from the summary. The...

This is a typical split-apply-combine type operation: split the data into the relevant chunks, in this case by day then bod; apply a function to each chunk of data, in this case fit a linear model and extract the slope coefficient; combine the slope parameters plus the aggregation data into a...

Here you go: sapply(fit,function(x) summary(x)$r.squared) 11 12 0.9657143 0.9657143 Or to do everything at once: sumfun <- function(x) c(coef(x),summary(x)$r.squared) t(sapply(fit,sumfun)) (you need to transpose the results from sapply to get the table as specified above). Then use names() <- or setNames() to get the column names the way you want...

Wrap a capture.output() around it write(capture.output(summary(fit)), "fit.txt") You'll get a nice clean .txt file with everything shown in the R console when you evaluate summary(fit). I do not recommend writing to .csv, but if you insist, just use write.csv. And for your/future readers' reference, if you do decide to automate...
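The same pattern sketched with a temporary file in place of "fit.txt":

```r
fit <- lm(mpg ~ wt, data = mtcars)
out <- tempfile(fileext = ".txt")
write(capture.output(summary(fit)), out)

# The file now contains the console rendering of summary(fit)
any(grepl("Coefficients", readLines(out)))  # TRUE
```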

You are not allowing dynlm to use the same amount of data as in lm. The latter model contains two fewer observations. dim(model.frame(reg1)) # [1] 24 7 dim(model.frame(lmx)) # [1] 22 7 The reason is that with lm you are transforming the variables (differencing) with the entire data set (31 observations),...

You can try something like: eval(form[[2]]) Normally you will have y, x1 and x2 as columns of a data.frame, e.g. df, and not objects in your global environment. In this case you can use: eval(form[[2]], envir = df) ...
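To make that concrete (toy df): form[[2]] is the left-hand-side symbol of the formula, and eval with envir = df looks it up in the data frame.

```r
df <- data.frame(y = 1:3, x1 = 4:6, x2 = 7:9)
form <- y ~ x1 + x2

form[[2]]                          # the symbol y
identical(eval(form[[2]], envir = df), df$y)  # TRUE
```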

models <- lapply(dsets, function(data){ lm(reformulate(termlabels=".", response=names(data)[1]), data) }) reformulate allows you to construct a formula from character strings....
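A self-contained sketch of that reformulate call on one toy data frame (hypothetical column names):

```r
d <- data.frame(outcome = rnorm(10), p1 = rnorm(10), p2 = rnorm(10))

f <- reformulate(termlabels = ".", response = names(d)[1])
f                       # outcome ~ .
fit <- lm(f, data = d)  # "." expands to all remaining columns
names(coef(fit))
```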

You can use the broom package to get the information easily. It's just a matter of adjusting the table you get to suit your needs: model <- lm(formula = mpg ~ interaction(gear, am, drop = T) - 1 + cyl, data = mtcars) library(broom) tidy(model)[grep("interaction", tidy(model)$term),] # term estimate std.error...

Try this: ss <- scatterplot3d(CO2umol,NEE,GS,pch=20, highlight.3d=TRUE, main="NEE: AC vs EC vs MOD") fit <- lm( GS ~ CO2umol+NEE, OBSvsMOD_NEE_hourly) ss$plane3d(fit) You should be working with the plane3d element of the object that is returned by scatterplot3d (ss$plane3d), not trying to find a plane3d element in scatterplot3d itself ......

You haven't provided a reproducible example (i.e., data and code that allows others to reproduce your error), but I don't have a problem when I try something similar with a built-in data frame: m1 = lm(mpg ~ wt + carb + qsec*hp, data=mtcars) pred.dat=data.frame(carb=2, hp=120, qsec=10, wt=2.5) predict(m1, newdata=pred.dat) 1...

You need to specify on what variable you are subsetting. The easiest thing is replacing your sub.data <- data line with sub.data <- data[(data$X > xinf & data$X < xsup),], so that you just have FitWeibull <- function(data, xinf, xsup){ sub.data <- data[(data$X > xinf & data$X < xsup),] my.lm <- lm(Y~X, data = sub.data)...

Try this ## Create some data set set.seed(1) my.data <- list(df1 = data.frame(Y = sample(10), X = sample(10)), df2 = data.frame(Y = sample(10), X = sample(10))) FitWeibull <- function(data){ my.lm1 <- lm(data[[1]]$X ~ data[[1]]$Y) my.lm2 <- lm(data[[2]]$X ~ data[[2]]$Y) par(mfrow = c(1, 2), pty = "s") plot(data[[1]]$X , data[[1]]$Y) abline(my.lm1)...

Grab them directly from the model. No need for using summary(): > model2$coefficients (Intercept) x x2 x3 0.9309032 0.8736204 NA 0.5493671 ...
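A sketch reproducing such an NA coefficient (the x2 here is a hypothetical, perfectly collinear column, not the asker's variable):

```r
set.seed(3)
d <- data.frame(x = rnorm(20))
d$x2 <- 2 * d$x           # perfectly collinear with x
d$y  <- d$x + rnorm(20)

fit <- lm(y ~ x + x2, data = d)
fit$coefficients          # x2 comes back NA: dropped as redundant
```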

I'm not sure why you want this, but it is trivial to get from fit. First, it is best not to delve into fitted objects like this with $. Instead learn to use extractor functions. In this case, the equivalent of mean(fit$model[,2]) would be, for all columns of the data...

You only have a single observation in your data.frame. You can't fit a model with 5 parameters with only a single observation. You would need at least six observations to be able to fit parameters and have an estimate of the variance.
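The counting argument sketched on synthetic data: with p coefficients you need n > p observations to have any residual degrees of freedom left for the variance estimate.

```r
set.seed(9)
d <- data.frame(y = rnorm(6), a = rnorm(6), b = rnorm(6),
                c = rnorm(6), e = rnorm(6))

fit <- lm(y ~ a + b + c + e, data = d)  # 5 parameters, 6 observations
df.residual(fit)                         # 1 df left for the variance
```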

One option is to use the felm function in the lfe package. As stated in the package: The package is intended for linear models with multiple group fixed effects, i.e. with 2 or more factors with a large number of levels. It performs similar functions as lm, but it uses a special...