Tuesday, November 3, 2015

Hey what's the deal with the ENMTools R package?

It has come to my attention that at least one person is actually using the ENMTools R package I sorta half-made a couple of years ago, for which I would like to express my deepest condolences.

Seriously, though, I did want to at least acknowledge its existence and the absolutely massive caveats that should come with any attempt to use it in its current state.  

The package exists because I needed a project in order to learn R; I've found that reading a book and doing examples is one thing, but to really assimilate a new language I need to have a project that makes me sit down and work on it every day.  When I started my postdoc at ANU a few years ago, I said to myself "I am going to do everything in R from this day forward, and in order to learn R I will rewrite as much of ENMTools as it takes to feel like I've mastered it".  

So that's what I did.  I wrote bits to generate reps for most of the major tests in ENMTools, including the background, identity, and rangebreak tests.  I also wrote code to measure breadth and overlap using the metrics in ENMTools, and a couple of other little utility functions.  That helped me get comfortable with the basics in R, and at that point I got busy enough with my actual postdoc work that I had to drop it.  
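For anyone who hasn't met the overlap metrics before, here is a minimal sketch of the two best-known ones, Schoener's D and the Hellinger-based I statistic. It's written in Python purely to keep the example self-contained, it is not the package's code, and the function names (`normalize`, `schoener_d`, `warren_i`) are made up for illustration. It assumes each model's prediction is just a list of non-negative suitability scores over the same set of grid cells.

```python
import math


def normalize(scores):
    """Rescale non-negative suitability scores so they sum to 1."""
    total = sum(scores)
    if total <= 0:
        raise ValueError("scores must have a positive sum")
    return [s / total for s in scores]


def schoener_d(x, y):
    """Schoener's D: 1 minus half the summed absolute differences."""
    if len(x) != len(y):
        raise ValueError("predictions must cover the same cells")
    px, py = normalize(x), normalize(y)
    return 1.0 - 0.5 * sum(abs(a - b) for a, b in zip(px, py))


def warren_i(x, y):
    """I statistic: 1 minus half the squared Hellinger-type distance."""
    if len(x) != len(y):
        raise ValueError("predictions must cover the same cells")
    px, py = normalize(x), normalize(y)
    return 1.0 - 0.5 * sum(
        (math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(px, py)
    )


# Hypothetical suitability scores for two species over four grid cells.
species_a = [1.0, 1.0, 0.0, 0.0]
species_b = [0.0, 1.0, 1.0, 0.0]
print(schoener_d(species_a, species_b), warren_i(species_a, species_b))
```

Both metrics run from 0 (no overlap) to 1 (identical predictions). The identity and background tests then compare an observed value like this against the values you get from replicate models built on pooled or randomized occurrence data.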

And that's pretty much where it stands today, a couple of years later.  It mostly works, but it ain't exactly pretty or well documented - it was my first R project, after all.  While some of its functionality has already been duplicated elsewhere (e.g., the identity and background tests in phyloclim), some of it hasn't (e.g., the rangebreak tests).  Now that I've been writing R pretty much daily for the past three years, I see a million things I did sub-optimally, and a bunch of areas where I could have taken advantage of existing functionality to do things more quickly, more cleanly, and with a lot more cool bells and whistles.

So why do I bring this up?  First, as I mentioned, because apparently some people are actually using it.  I'm not sure whether that's due to masochism or desperation, but they are.  Second, and more importantly, because I'm going to try to bash it into a somewhat more useful form over the next however-long.  It's probably not going to duplicate all of the functionality of the original ENMTools, but the eventual goal is to include a lot of very cool stuff that the old version didn't have.  If you want to contribute or are brave enough to muck around with it in its current state, it's here:

2 comments:

  1. Hey Dan,

    I have been looking into your ENMTools package recently. There are two tools I'm interested in:

    1) how to calculate AIC with Maxent.
    2) comparing predictions from SDMs using distance metrics (your niche similarity)

    I'm happy to offer whatever help I can provide in developing the package (as these are problems I'm very interested in)...or if it turns out I don't have the necessary skills to help develop it, I am definitely interested in using it. Thanks for developing this package!

    Ps. I'm a grad student at UT, and stavana called my attention to this blog. Thanks for hosting it!

  2. Hey there! Part 2 is already implemented in the R stuff, but could certainly stand to be improved. Part 1 I'm not so sure about - although the simulation results on AIC were good, it's still a very weird fit for Maxent models for a number of reasons (some mentioned in that paper, some in Warren et al. 2014, and some in a couple of other recent papers). My thought is that if someone's working in R and wants to do AIC (both of which I think are great ideas), it's probably more reasonable to just use one of the methods for which AIC is a more natural fit (e.g., GLM or something like that). That's the main reason I haven't yet coded AIC into the R version, and I'm not really sure whether it's worth doing in the future.
