Hi,
I am getting the following message when I try to fit a simple linear model, y = mx + c. I have 12,000 rows.
*** ERROR 10037 *** PROBLEM EXCEEDS AVAILABLE RESOURCES
Please let me know.
thanks,
Suresh
Posted 28 June 2016 - 12:39 AM
The error message is telling, isn’t it?
I guess that PHX was not built for such a high number of cases and you are running out of available RAM.
12,000 rows are peanuts for R. Example:
    r <- 5e7   # rows
    c <- 2     # columns
    x <- rnorm(r, mean = 1)       # generate random
    y <- x + rnorm(r, mean = 0.1) # data (slope 1, intercept 0.1)
    data <- matrix(c(x, y), nrow=r, ncol=c,
                   dimnames=list(rep(NULL, r), c("x", "y")))
    # fit a linear model with intercept
    mod <- lm(y ~ x, data=as.data.frame(data))
    print(mod)
Takes a couple of seconds on my machine (16 GB RAM) to generate a data set with 50 million (!) rows of random data and fit a linear model.
    Call:
    lm(formula = y ~ x, data = as.data.frame(data))

    Coefficients:
    (Intercept)            x
         0.0977      0.99999
Even in basic R sooner or later you will run out of RAM (the function lm() does not swap to disk)… If you have really large data sets, consider package biglm (https://cran.r-proje...g/package=biglm).
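To illustrate why a straight-line fit does not need all rows in memory at once, here is a minimal base-R sketch that accumulates five running sums chunk by chunk and then solves the closed-form least-squares equations. This is a swapped-in illustration of the general idea, not how biglm or PHX work internally; the seed, the simulated data, and the chunk size of 3000 are assumptions made only for the example.

```r
# Simulated data (assumption): intercept 0.1, slope 1, error SD 1
set.seed(1)
n <- 12000
x <- rnorm(n, mean = 1)
y <- 0.1 + x + rnorm(n)

# Hypothetical chunking: pretend the rows arrive 3000 at a time
chunk_size <- 3000
chunk_id   <- ceiling(seq_len(n) / chunk_size)

# Running sums: count, sum(x), sum(y), sum(x^2), sum(x*y)
s <- c(n = 0, x = 0, y = 0, xx = 0, xy = 0)
for (idx in split(seq_len(n), chunk_id)) {
  xi <- x[idx]
  yi <- y[idx]
  s  <- s + c(length(idx), sum(xi), sum(yi), sum(xi^2), sum(xi * yi))
}

# Closed-form OLS solution for y = intercept + slope * x
slope     <- (s[["n"]] * s[["xy"]] - s[["x"]] * s[["y"]]) /
             (s[["n"]] * s[["xx"]] - s[["x"]]^2)
intercept <- (s[["y"]] - slope * s[["x"]]) / s[["n"]]

print(c(intercept = intercept, slope = slope))
print(coef(lm(y ~ x)))   # should agree with the chunked result
```

Only the five sums are kept between chunks, so memory use does not grow with the number of rows. For poorly scaled data the raw sums can lose precision, so a dedicated package is the safer choice for real work.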
Posted 28 June 2016 - 02:16 PM
Hi Simon.
It may also be you've not set up the problem well, e.g. no initial estimates but saying that you would provide them. Please either post the project here or email it to support.
Attached an example with 12,000 rows. Only the WNL model 502 fails: No problems if coded as a PHX-model.
Posted 28 June 2016 - 02:53 PM
It may also be you've not set up the problem well, e.g. no initial estimates but saying that you would provide them. Please either post the project here or email it to support.
thanks, Simon.
Hi,
I tried with and without initial estimates; both gave me an error. I think this is related only to the number of rows: with more than 3,000 rows it doesn't work.
Thanks,
Suresh
Posted 28 June 2016 - 02:55 PM
Hi,
Thanks for the PHX model. It seems to work. Can you send me the same code for PHX build 6.3.0.395? When I opened the Manyfiles.phx project, it gave an error due to the version change.
thanks,
Suresh
Posted 28 June 2016 - 02:56 PM
Hi,
If nothing works in PHX, I will try it in R.
thanks for the code,
Suresh
Posted 28 June 2016 - 04:25 PM
Note that the linear model 502 in WNL uses (ordinary) least squares minimization, (O)LS, whereas all PHX-models use restricted maximum likelihood (REML).
Without knowing the background of your problem it is difficult to recommend anything. Note that in generating my data I used a variance of 1. For my data set I got the following with the PHX-model:
    a0      0.0769702
    a1      1.01384
    stdev0  0.989692
Note that not only the parameters (a0, a1) are estimated but the error as well.
In R:
    library(nlme)
    mod1 <- lm(y ~ x, data=data)                    # OLS
    mod2 <- lme(y ~ x, random= ~ 1 | x, data=data)  # REML by default
    summary(mod1)
    summary(mod2)
Gives:
    Call:
    lm(formula = y ~ x, data = data)

    Residuals:
        Min      1Q  Median      3Q     Max
    -3.9928 -0.6668  0.0048  0.6620  5.3788

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept) 0.077016   0.012763   6.034 1.64e-09 ***
    x           1.013820   0.009098 111.434  < 2e-16 ***
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 0.9898 on 11998 degrees of freedom
    Multiple R-squared:  0.5086,    Adjusted R-squared:  0.5086
    F-statistic: 1.242e+04 on 1 and 11998 DF,  p-value: < 2.2e-16

    Linear mixed-effects model fit by REML
     Data: data
           AIC     BIC    logLik
      33829.03 33858.6 -16910.52

    Random effects:
     Formula: ~1 | x
            (Intercept)  Residual
    StdDev:  0.05123659 0.9884501

    Fixed effects: y ~ x
                   Value  Std.Error    DF   t-value p-value
    (Intercept) 0.076984 0.01276893 10071   6.02901       0
    x           1.013845 0.00910216 10071 111.38508       0
     Correlation:
      (Intr)
    x -0.706

    Standardized Within-Group Residuals:
             Min           Q1          Med           Q3          Max
    -4.028606196 -0.672923072  0.004855385  0.667628142  5.427081375

    Number of Observations: 12000
    Number of Groups: 10073
nlme/lme() comes close to the estimates of PHX. Personally for such a simple problem I would prefer (O)LS.
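As a side note on OLS versus likelihood-based fitting, here is a small sketch assuming a model with fixed effects only (no random term, unlike the lme() call above). It uses gls() from nlme to fit the same straight line by REML and by ML and compares both with lm(). The simulated data, the seed, and the sample size are assumptions for illustration only, and nothing here claims to reproduce what PHX does internally.

```r
library(nlme)  # recommended package, normally installed with R

# Simulated data (assumption): intercept 0.1, slope 1, error SD 1
set.seed(42)
n <- 200
d <- data.frame(x = rnorm(n, mean = 1))
d$y <- 0.1 + d$x + rnorm(n)

fit_ols  <- lm(y ~ x, data = d)                     # ordinary least squares
fit_reml <- gls(y ~ x, data = d, method = "REML")   # restricted maximum likelihood
fit_ml   <- gls(y ~ x, data = d, method = "ML")     # maximum likelihood

# Intercept and slope from the three fits
print(rbind(OLS = coef(fit_ols), REML = coef(fit_reml), ML = coef(fit_ml)))

# Estimated residual standard deviation from the three fits
print(c(OLS  = summary(fit_ols)$sigma,
        REML = fit_reml$sigma,
        ML   = fit_ml$sigma))
```

With independent, homoscedastic errors and no random effects, the intercept and slope should agree across all three fits. The REML error estimate should match the OLS residual standard error (both divide by n - 2), while ML divides by n and is therefore slightly smaller. Differences like those in the lme() output above come from the added random term, not from REML as such.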