Scripts Home |
The Structured Query Language (SQL) is the common language of most database software such as MySql, Postgresql, Oracle, DB2, etc. This language translates the relational theory into practice but imperfectly, SQL is a language that is a loose implementation of relational theory and has been further modified in its actual implementation by the Relational Database Management System (RDBMS) software that uses it.
There is lots of literature and discussion of relational theory and its application in product and software design. Authors like Chris Date, Hugh Darwen and Fabian Pascal write extensively on the topic at http://www.dbdebunk.com/index.html, they are relational theorists that advocate strict standards in design and implementation and have written lots of books on relational databases and design.
This article is just an introduction to relational databases and some of the gyrations that SQL has done to implement (and not) some of the operations possible under this theory, not a detailed critique or explanation of relational database theory and its application to project design, but familiarity with the concepts of relational theory is needed by anyone who uses a database in the design of a product or software project.
Relational algebra and relational calculus are the mathematical basis for relational databases developed by E.F. Codd. I would describe it as a kind of set theory that gives a solid provable framework for software design that involves lots of data that must be managed. If the project you are looking at uses a database then these ideas should be looked at and considered carefully for the design.
The application of relational theory usually contains the point of view that the data is used by more than one application. As a result, architecture and design decisions are made that optimize the organization of the data for use by many applications, not a specific physical optimization of a data element for one application.
An example could be a company that manufactures and sells something is a collection of applications organized around the data that the company needs to run itself. Customer data, employee data, inventory data, manufacturing process data, shipping data, invoice data, general ledger data, supplier data, etc., are all related in a logical design, implemented in a physical design using the RDBMS and then used by many applications such as a web store, a point of sale application, an inventory application, tracking employee hours and projects, accounts, payroll, etc.
As I mentioned at the beginning, the SQL standards that exist, such as SQL-92 or SQL-99 are not followed exactly in any RDBMS software. Well, here we are, an imperfect implementation of relational theory is set in a language standard that is not followed. Since SQL is the common tool that is the interface to most databases, we must try to use it.
Here are some examples of that imperfect language, SQL, trying to do the relational algebra operations of difference, simple division and partition in MySql and Postgresql.
Which records in TABLE_A do not share A_KEY in TABLE_B?
select * from TABLE_A where A_KEY not in (select A_KEY from TABLE_B)
With the SQL-92 Standards keyword 'EXCEPT' Follow the same rules as the keyword 'UNION'
select * from TABLE_A EXCEPT select * from TABLE_B also seen as: select * from TABLE_A MINUS select * from TABLE_B
In a many-to-many relationship there are three tables, A, B, C with C as the table representing the many-to-many key pairs of A and B.
For simple division: What are the 'A_KEY's to which all 'B_KEY's belong?
select distinct A_KEY from TABLE_C C where not exists ( select B_KEY from TABLE_B B where not exists ( select * from TABLE_C CC where A.A_KEY = CC.A_KEY and B.B_KEY = CC.B_KEY ))
What are the records of 'TABLE_A' that have the top ten values of 'ATTRIBUTE'?
select * from TABLE_A A, TABLE_A B where A.ATTRIBUTE <= B.ATTRIBUTE group by KEY, ATTRIBUTE having count(*) <= 10 order by ATTRIBUTE
Or another SQL hack at partition:
select * from TABLE_A A where 10 >= (select count(*) from TABLE_A B where B.ATTRIBUTE >= A.ATTRIBUTE)
What are the records of 'TABLE_A' that share a 'KEY' with records of 'TABLE_B'?
select * from TABLE_A where TABLE_A.KEY in (select TABLE_B.KEY from TABLE_B)
Or if 'intersect' is implemented it follows the same rules as 'union.'
select distinct * from depositor intersect select distinct * from borrower