SQL Style Guide
Kai Hofmann powerstat@web.de
08.05.2003
Style guide background
This style guide describes how you should name and design the database for your application module.
The base idea behind all this is to improve the usability, performance, maintainance and understandability of you database part and application.
This guide tries to be database independent. So using it should make your scheme more portable to different dbms.
Naming tables and columns
Tables and columns should always be named in
- lower case
- english language
and a name should not use more than 32 characters.
Each table name should describe the tables content in one or two words - use the plural form.
For shared hosts with only one database available there should be a application specific prefix for each table name to avoid name collisions when using multiple applications within this one database.
Examples:
events
accounts
bookmarks
servers
timetrack_jobs
log_events
appname_events
appname_accounts
appname_servers
Each column of a table must start with a shortened version of the table name with a maximum of 6 characters (don't use the tables application prefix here).
For example use:
- msg
- for messages
- cal
- for calendars
- adr
- for addresses
- conf
- for configurations
- cat
- for categories
- log
- for log entries
- proj
- for projects
- cont
- for contacts
- inv
- for invoices/inventories
Please be careful to use each label only in one table.
So if you have an invoices table as well as an inventory table use inv for the invoices and invt for the inventory table.
After the label you should use "_" and then the singular form of a subject that best describes the column:
- cal_date
- table calendar colum with a date
- cal_name
- table calendar column with calendar event name
- cal_description
- table calendar column with event description
For long fieldnames like description you can use short ones like "descr".
Try using the same vocabulary in all tables.
This nameing scheme will also shorten your select statemtens when using multiple tables, because the column names are more
unique and so you only need to use the table name for the primary/foreign key where condition!
Using datatypes
Datatypes for the columns should be used wisely and with care.
Try to use the datatype that fits best and avoid the usage of blob datatypes, because they are NOT portable between different dbms. A varchar has a maximum length of 255 characters. When using DATETIME (prefered over DATE and TIME because DATETIME is supported by more dbms) you should read/write to this datatype in ISO8601 format [4] "yyyy-mm-ddThh:mm:ss".
languages | DT |
---|---|
lang_code | CHAR(2) |
lang_name | VARCHAR(32) |
Foreign keys for relations must always have the same datatype as the primary of the referenced table.
Defining primary keys
For primary keys always use an integer datatype (for performance reasons).
Only use another type when this is a real good alternative like for a languages table when using the iso 2 character code as primary key (like in chapter 3).
Primary keys should always be named as "id" if possible (including the tables short name):
cal_id
event_id
msg_id
Except for good reasons like in the language example from chapter 3.
Relations and foreign keys
Relations between tables should be used whereever possible.
A database is optimized for handling complex data structures - so let the database do its job instead of rewriting the whole thing within your application.
Foreign keys should be "copied" from the parent table with the same name and the same datatype.
Using different datatypes for primary and foreign key will not work on all database systems and also it will cost a lot of performance.
So when referencing the languages table from chapter 3 your messages table should look as follows:
messages | DT |
---|---|
msg_id | INT |
msg_subject | VARCHAR(80) |
msg_body | VARCHAR(255) |
lang_code | CHAR(2) |
this allows an easier identification of the referenced tables.
A special trick that is very useful sometime are self referencing tables (for example nested categories).
In such a case you must rename the foreign key because it can not have the same name as the primary key.
categories | DT | Comment |
---|---|---|
cat_id | INT | |
cat_name | VARCHAR(32) | |
cat_parent_id | INT | (reference to cat_id) |
Defining indexes
Indexes will speed up the processing of select statements when they are set correctly.
In database systems without real "foreign keys/relations" (like MySQL 3.x) you should create an index for each key that references another tables.
Otherwise MySQL will do a cost intensive full table scan each time.
You also should create indexes for often used select statements.
3rd normal form (3NF)
For a good database design you should generally use the 3rd normal form.
This will avoid double data within your database and has many more advantages.
For more about database design and the 3rd normal form please read [2], [3].
Interesting online resources can be found under [5].
Database access from the application
Within your application you should NOT try to do the databases job - for example don't associate the data from two tables with each other within your application - the database is always faster and better in doing this job.
Try to move your database access code into extra classes or as a minimum move it to an extra method. This will allow better maintanance later in the lifecycle of your software.
Write a SQL Statement into one line and don't split it into several lines. This will allow searching via grep.
Use dynamic parts (variables) only within the WHERE
clause.
Write the SQL-Keywords in uppercase - all other things in lowercase.
Avoid the *
as in SELECT * FROM table
better use the column names you really need to avoid confusion and speed up the processing.
You should always use ANSI-SQL standard statements within your application. When you have no choice of using database specific code then place it into an extra file wich should be named by the databases name.
How to write select statements with multiple tables
There are several ways how to join tables via a select statement.
Use the following form:
SELECT evts_name,cal_date FROM events,languages,calendar WHERE event.lang_code = languages.lang_code AND event.cal_id = calendar.cal_id
SELECT evts_name,cal_date FROM events LEFT JOIN languages LEFT JOIN calendar WHERE event.lang_code = languages.lang_code AND event.cal_id = calendar.cal_id
That's because this code is more portable between database systems and the dbms will convert other forms into this one.
Also this form is shorter, more readable, performs better (because the dbms must not transform it) and is less error prone.
Please do NOT use the following version, because it is not so readable (only when using brackets) and more error prone, because multiple joins are only readable when splitting them into several lines.
Expanding an SQL-statement is not as simple as the above form.
SELECT evts_name FROM events JOIN languages ON (event.lang_code = languages.lang_code) JOIN calendar ON event.cal_id = calendar.cal_id
Connections to other sources like files, ldap and imap
Instead of using blobs (which are not portable and very cost intensive) you should save large data as files and reference these files by filename and/or relative path from a varchar column.
Referencing emails via IMAP etc. would work via the unique message id.
For LDAP you can use the DN (distinguished name) as long as it is not larger than 255 characters (varchar limit).
Otherwise it might be better to work with UIDs or add your own unique identifier to each object for referencing it from the database.
Another way of referencing external resources are URIs - for more please take a look at [6].
References
Bibliography
-
A Guide to the SQL Standard.
A user's guide to the standard database language SQL.
C. J. Date, Hugh Darwen -
An Introduction to Database Systems
C. J. Date -
Fundamentals of Database Systems.
Ramez Elmasri, Shamkant B. Navathe -
http://www.w3.org/TR/NOTE-datetime
http://www.mcs.vuw.ac.nz/technical/software/SGML/doc/iso8601/ISO8601.html
http://www.cl.cam.ac.uk/~mgk25/iso-time.html -
http://www.devshed.com/Server_Side/MySQL/Normal/Normal1/page4.html
http://support.microsoft.com/?scid=/support/kb/articles/q209/5/34.asp
http://home.earthlink.net/~billkent/Doc/simple5.htm
http://www.databasejournal.com/sqletc/article.php/26861_1428511_4
http://www.sqlmag.com/Articles/Index.cfm?ArticleID=4887&pg=1
http://databases.about.com/library/weekly/aa080501a.htm
http://databases.about.com/library/weekly/aa081901a.htm
http://databases.about.com/library/weekly/aa090201a.htm
http://databases.about.com/library/weekly/aa091601a.htm