Non-Linear Regressions on Large Scale Data With the nlsLoop and nls.multstart Packages

A brief aside (I am in crunch mode and can’t write more). I have been getting into nls (very exciting) these days and learning quite a bit about it. One issue with fitting these models is the difficulty of finding appropriate starting values. When you are fitting a single model, finding starting values is not that hard. When you have thousands of products to fit diffusion models on, the problem becomes a bit more complicated.

If you consider that each product can have vastly different diffusion parameters, fitting nls models can get hairy. Using a single set of starting values for all of them can lead to singular gradient errors for most of the products.

The nlsLoop package solves this problem by brute forcing the starting values. You give it a range of values for each parameter, and the package handles the search, picking the best fit based on AIC. Pretty nifty stuff.
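To make the idea concrete, here is a rough sketch of a multi-start search in plain Python. This is not nlsLoop's code and not its API. The model (a saturating curve standing in for a diffusion model), the function names, the parameter ranges and the toy data are all my own assumptions for illustration. It draws random starting values from a range, runs a small damped Gauss-Newton fit from each, skips the starts that fail, and keeps the fit with the lowest AIC.

```python
import math
import random


def model(x, a, b):
    # Hypothetical saturating curve standing in for a diffusion model.
    return a * (1.0 - math.exp(-b * x))


def rss(xs, ys, a, b):
    # Residual sum of squares for parameters (a, b).
    return sum((y - model(x, a, b)) ** 2 for x, y in zip(xs, ys))


def fit_once(xs, ys, a, b, max_iter=100, tol=1e-10):
    """Damped Gauss-Newton from one starting point; raises on failure."""
    current = rss(xs, ys, a, b)
    for _ in range(max_iter):
        # Build the 2x2 normal equations (J'J) d = J'r.
        saa = sab = sbb = ga = gb = 0.0
        for x, y in zip(xs, ys):
            e = math.exp(-b * x)
            ja = 1.0 - e      # df/da
            jb = a * x * e    # df/db
            r = y - a * ja    # residual
            saa += ja * ja
            sab += ja * jb
            sbb += jb * jb
            ga += ja * r
            gb += jb * r
        det = saa * sbb - sab * sab
        if det <= 1e-12 * saa * sbb:
            raise ArithmeticError("singular gradient")
        da = (sbb * ga - sab * gb) / det
        db = (saa * gb - sab * ga) / det
        # Halve the step until the fit stops getting worse.
        step = 1.0
        while True:
            trial = rss(xs, ys, a + step * da, b + step * db)
            if trial <= current + tol * (1.0 + current):
                break
            step /= 2.0
            if step < 1.0 / 1024:
                raise ArithmeticError("step factor too small")
        a, b = a + step * da, b + step * db
        if abs(current - trial) <= tol * (1.0 + current):
            return a, b, trial
        current = trial
    raise ArithmeticError("no convergence")


def aic(n, rss_value, n_params=2):
    # Least-squares AIC up to a constant; +1 counts the error variance.
    return n * math.log(rss_value / n) + 2 * (n_params + 1)


def multistart(xs, ys, a_range, b_range, n_starts=50, seed=1):
    """Try random starts within the given ranges; keep the lowest AIC."""
    rng = random.Random(seed)
    best = None
    failures = 0
    for _ in range(n_starts):
        a0 = rng.uniform(*a_range)
        b0 = rng.uniform(*b_range)
        try:
            a, b, r = fit_once(xs, ys, a0, b0)
            score = aic(len(xs), r)
        except (ArithmeticError, ValueError):
            failures += 1  # this start did not produce a usable fit
            continue
        if best is None or score < best[0]:
            best = (score, a, b)
    return best, failures


# Toy data: known parameters (a=5, b=0.3) plus a small deterministic wiggle.
xs = [float(i) for i in range(1, 21)]
ys = [model(x, 5.0, 0.3) + 0.05 * math.sin(7.0 * x) for x in xs]

best, failures = multistart(xs, ys, a_range=(0.1, 20.0), b_range=(0.01, 2.0))
print("best (AIC, a, b):", best)
print("failed starts:", failures)
```

Since every start fits the same model here, the lowest AIC is simply the lowest residual sum of squares. With thousands of products, you would run `multistart` once per product, each with the same parameter ranges.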

My only wish is that this package used more of the cores I have sitting idle. The process is easily parallelizable, since each product is fit independently. Perhaps once this mad rush is behind me I will contribute that part.

Check it out here: https://github.com/padpadpadpad/nlsLoop

EDIT:

After a couple of weeks, once my hurry had subsided, I came across the nls.multstart package from the same author. It seems he optimized his code himself.

https://cran.r-project.org/web/packages/nls.multstart/index.html
