Emacs Apologia (2019)

It’s 2019. I’ve been using Emacs for more than a decade and I’m not inclined to stop. Sometimes my colleagues get on my case about it: why not use (for instance) RStudio or Jupyter or whatever other IDEs are floating around out there?

They’ve got a point: if you’re doing one thing, it’s hard for Emacs to beat a custom solution, which usually has much bigger mind share and corporate support to boot.

But most of the time I’m not doing one thing. I’m doing a few related things, and it’s in this context that Emacs shines. I tell my friends that Emacs is a general-purpose, text-based task thingamajig.

Imagine the scene!

You’ve got a problem you’re working on in R. Because you’re extremely professional, you do your work in a dev environment which is reified as a Docker image. You realize you need to add a dependency, so in your *R* buffer (maybe you’re using ESS, maybe not; I don’t) you just say:

> install.packages("gbm")

You also add the dependency to your deps.R script, which your Dockerfile runs. M-x shell creates a new shell, where you run:

> docker build . -t ds

And your image is rebuilt in the background. Maybe you find yourself doing this a lot, so you say:

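(defun do-build ()
  "Rebuild the ds image in a dedicated *docker-build* shell buffer."
  (interactive)
  ;; `shell' creates (or reuses) the named buffer and returns it.
  (comint-send-string
   (get-buffer-process (shell "*docker-build*"))
   "docker build . -t ds\n"))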

And then you can just M-x local-set-key C-c C-c do-build and it’s just a keystroke away.
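A binding made with local-set-key lives in the current major mode’s keymap and disappears when Emacs exits. To keep it around, something like the following could go in an init file. This is only a sketch: the key C-c b is an arbitrary, hypothetical choice (C-c followed by a letter is reserved for users), and it assumes do-build is defined earlier in the same file.

```elisp
;; Assumption: `do-build' is already defined in this init file.
;; "C-c b" is an illustrative key choice, not part of the workflow above.
(global-set-key (kbd "C-c b") #'do-build)
```

Unlike the local binding, this one is available in every buffer, which may or may not be what you want.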

While that is happening, you’re trying to figure out why some values are turning up NA when you read from an SQLite DB into a data frame. You want to inspect the database manually. So you M-x shell into a buffer called *sqlite*, then:

> docker run ... sqlite3 /path/to/sqlite.db

Now you want to run exactly the SQL you’ve got in your R script, so you write the following absolute gem of a function:

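(require 'comint)

(defvar *default-comint-buffer* nil
  "Name of the buffer that `region->comint' sent to most recently.")

;; This helper is not built into Emacs; this is one possible definition.
(defun get-buffers-with-processes ()
  "Return the names of all live buffers that have a process attached."
  (delq nil (mapcar (lambda (proc)
                      (let ((buf (process-buffer proc)))
                        (and (buffer-live-p buf) (buffer-name buf))))
                    (process-list))))

(defun region->comint (s e)
  "Send the region between S and E to a process buffer chosen by name."
  (interactive "r")
  (let* ((bufs (get-buffers-with-processes))
         (dflt (or *default-comint-buffer*
                   (car bufs)))
         (buffer (completing-read "where? " bufs nil t dflt))
         (text (concat (buffer-substring-no-properties s e) "\n")))
    (comint-send-string (get-buffer-process (get-buffer buffer))
                        text)
    (pop-to-buffer (get-buffer buffer))
    (setq *default-comint-buffer* buffer)))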

And now you can highlight SQL fragments in any buffer, M-x region->comint *sqlite*, and you’ll execute that code and jump to the buffer.

And region->comint will do an enormous amount of legwork for you. Suppose your project uses multiple languages: R for one step, Python for another. That’s a hassle if you’re using a Notebook or RStudio, but relatively easy to orchestrate inside Emacs.
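As a rough illustration of that orchestration, here is a sketch that starts one REPL buffer per language so that each shows up as a target for region->comint. The function names start-repl and start-project-repls are hypothetical, and the program names and flags (R --no-save, python3 -i) are assumptions about a local install; if your tools live in a container you would substitute your own docker run invocation.

```elisp
(require 'comint)

(defun start-repl (name program &rest switches)
  "Run PROGRAM with SWITCHES in a comint buffer named *NAME*; return the buffer.
If *NAME* already has a running process, it is left alone and reused."
  (apply #'make-comint name program nil switches))

(defun start-project-repls ()
  "Start one REPL per language used in the project (hypothetical setup)."
  (interactive)
  ;; Assumed program names and flags; adjust to your environment.
  (start-repl "R" "R" "--no-save")
  (start-repl "python" "python3" "-i"))
```

After M-x start-project-repls, the *R* and *python* buffers both have live processes, so a region from any file can be sent to whichever one you name at the prompt.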

Sure, lots of stuff is missing. People really love Tab completion and it’s not always perfect in Emacs.

But if you do complicated, multi-environment, text-based tasks, Emacs is still, far and away, the best tool for the job. It also works over a terminal and can act as a server, which means you can pop in and out as you need, leaving the environment up for months at a time. These days I keep multiple Emaxen running as daemons, one for each active project.

Emacs is indispensable for me, especially in 2019.