Rails Performance: The Database
My primary toolkit is the Ruby on Rails framework, and I mainly work with applications that have been in production for a while, which usually means they’re in need of a little pick-me-up. I thought it would be good to write a mini series about the things I typically look for when developing and maintaining Rails apps.
Picking up a Database
When I start working on an app with history the first thing I do after bundling and running the test suite is create a diagram of the database. This used to take up lots of time because I was doing it by hand, but recently I have been using rails-erd to automate the process.
Running “rake erd” creates a pdf based off the applications models.
This is a massive timesaver but it does take away the process of building the ERD diagram - which is invaluable for pointing out red flags such as associations that are not needed and fields that are no longer used.
At secondary school I was lucky enough to take a Oracle certification in database design, and one of the core things we learned was Database Normalisation, which came with a set of guidelines for creating good databases.
The guidelines are broken into stages:
First Normal Form (1NF)
Related data should be grouped into separate tables, there should not be duplicated fields in the same table and each row should have a unique key (Primary Key).
Second Normal Form (2NF)
Separate fields into a new table that apply to multiple rows.
For example: if you have a projects table, the client should be in their own table so that when you update their address you only have to do it in one place, not in 10.
Tables should be related to each other using primary and foreign keys to join the related tables.
Third Normal Form (3NF)
Ensure the columns in the table are dependent on the primary key.
For example, if you have a teacher who is part of a department, they should be in separate tables because department is it own entity that can exist even if it hasn’t got any teachers.
Fourth and Fifth Normal Form (4NF)
Both fourth and fifth normal form discuss multivalue dependencies, which in Rails-land are Has and belong to many and has many through.
For example: a skill should not be stored in the same table as a user.
The Rails framework is pretty good at ensuring that these guidelines are followed, the generators automatically create a primary key, if you use the references migration type it will automagically set up the foreign key relation.
These rules are definitely more of a suggestion and in some cases can be ignored if there are good reasons to do so: one example is optimisation. By storing all of a users skills as a serialised array in one field you could speed up a request by removing the need for doing a SQL join.
I glossed over some of the finer details, so if you are interested in finding out more on the subject, a great technical paper to read is A Simple Guide to Five Normal Forms in Relational Database Theory. If you prefer lighter reading the about series is pretty good.