Tech Blog

Django shell_plus with Pandas, and Jupyter Notebook

Django shell_plus with Pandas, and Jupyter Notebook

Recently updated on

The Django shell provides an environment where developers can interact with the database via Django's ORM. While the shell is great for basic interactions, it quickly becomes laborious to work in, be it due to manual imports, having to scroll through shell history to repeat commands, or working with / viewing queries returning more than, say, 20 records. These issues, and more, can be remedied by interacting with the ORM in Jupyter notebooks, using Pandas.

Our environment will be setup inside a virtual environment using Django 2.2.  Once inside the virtual environment, install IPython, Jupyter, Pandas, django-extensions, and django-pandas.

pip install ipython jupyter pandas django-extensions django-pandas

django-extensions needs to be placed into INSTALLED_APPS in your settings.py file.

INSTALLED_APPS = [
    ...
    'django_extensions',
]

Once installed, then run

python manage.py shell_plus --notebook

shell_plus is Django shell with autoloading of the apps database models and subclasses of user-defined classes. The --notebook argument tells shell_plus to open in a Jupyter notebook.

You will notice that a new Jupyter tab opens in your browser. In the upper right corner click on the New drop-down and select Django Shell-Plus.

A new tab will open and you will see your new Jupyter notebook.  Import django-pandas read_frame.

Let's explore our new and improved shell environment by interacting with the following models.

class Movie(models.Model):                                      
                                                                
    title = models.CharField(max_length=100)                    
    year = models.IntegerField()                                
                                                                
    def __str__(self):                                          
        return f"{self.title} ({self.year})"                    
                                                                
                                                                
class Actor(models.Model):                                      
                                                                
    name = models.CharField(max_length=100)                     
    movies = models.ManyToManyField('Movie')                    
                                                                
    def __str__(self):                                          
        return self.name

Lets say we want to explore Tom Hanks movies.  We pass read_frame a QuerySet, and it returns a Pandas DataFrame.  Now, of course, you can bring all the powers of Pandas to bear on your data.

For a basic example,  lets say we want to see all of his movies after 2015, in alphabetical order.

It is true that something similar could be done without pandas, but it isn't nearly as clean and concise, and you would just be trying to recreate what pandas already does so well.

Whether you use Pandas or not, since your code exists in Jupyter notebook cells, there is no need to scroll through the shell history to reuse/change code snippets.

The above is just a taste of what can be done in this environment.

Further reading:

Jupyter Notebook Tutorial

Pandas Tutorial

Pandas and SQL Tutorial