SQL Tutorial - Database design

Understanding Database Design: A Comprehensive Guide

Database design determines how data is logically stored, and it has a significant impact on how the database will be queried, whether for reading data or updating data. It's crucial to understand database design concepts to ensure efficient and effective data management.

Database Models and Schemas
---------------------------

Database models are high-level specifications for database structure, while a schema is a database's blueprint: the implementation of the database model. The relational model, the most popular of these, is the model used to build relational databases. It defines rows as records and columns as attributes, and it imposes rules such as requiring each row to have a unique key.
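As a minimal sketch of the unique-key rule, the snippet below uses Python's built-in `sqlite3` module with an in-memory database. The `artists` table, its columns, and the sample names are assumptions chosen for illustration; they are not taken from the tutorial's diagrams.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE artists (
        artist_id INTEGER PRIMARY KEY,  -- unique key identifying each record (row)
        name      TEXT NOT NULL         -- an attribute (column) of the artist
    )
""")
conn.execute("INSERT INTO artists (artist_id, name) VALUES (1, 'Artist A')")

# A second row with the same key violates the uniqueness rule and is rejected.
try:
    conn.execute("INSERT INTO artists (artist_id, name) VALUES (1, 'Artist B')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

rows = conn.execute("SELECT artist_id, name FROM artists").fetchall()
print(duplicate_rejected, rows)  # True [(1, 'Artist A')]
```

Only the first record survives, because the database itself enforces the key rather than leaving it to application code.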

Other models exist that do not enforce the same rules. A schema makes the logical structure more granular by defining the specific tables, fields, relationships, indices, and views a database will have. Structured data inserted into a relational database must respect its schema.
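To make the parts of a schema concrete, here is a small sketch that defines two tables, their fields, a relationship (a foreign key), an index, and a view, and then tries an insert that does not respect the schema. All object names (`albums`, `songs`, `idx_songs_album`, `song_titles`) are hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when this is on
conn.executescript("""
    CREATE TABLE albums (
        album_id INTEGER PRIMARY KEY,
        title    TEXT NOT NULL
    );
    CREATE TABLE songs (
        song_id  INTEGER PRIMARY KEY,
        title    TEXT NOT NULL,
        album_id INTEGER NOT NULL REFERENCES albums (album_id)  -- relationship
    );
    CREATE INDEX idx_songs_album ON songs (album_id);
    CREATE VIEW song_titles AS
        SELECT s.title AS song, a.title AS album
        FROM songs AS s
        JOIN albums AS a ON a.album_id = s.album_id;
""")

# List every object the schema defines.
objects = conn.execute(
    "SELECT type, name FROM sqlite_master "
    "WHERE name NOT LIKE 'sqlite_%' ORDER BY type, name"
).fetchall()
print(objects)

# Inserting a song that points at a non-existent album does not respect the schema.
try:
    conn.execute("INSERT INTO songs (song_id, title, album_id) VALUES (1, 'Song A', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)  # True
```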

The first step in database design is data modeling, which is the abstract design phase where we define a data model for the data to be stored. There are three levels to a data model:

* Conceptual data model: This level describes what the database contains, such as its entities, relationships, and attributes.

* Logical data model: This level decides how the entities and relationships map to tables.

* Physical data model: This level looks at how data will be physically stored at the lowest level of abstraction.
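One way to picture the three levels is to carry a tiny model through each of them. The sketch below is an illustration under assumptions: the attribute names, the `to_ddl` helper, and the pluralized table names are all hypothetical, and a real design process would not normally generate tables this mechanically.

```python
import sqlite3

# Conceptual level: what the database contains (entities, attributes, relationships).
conceptual = {
    "entities": {
        "artist": ["name"],
        "album": ["title"],
        "song": ["title"],
    },
    # Each pair reads "child belongs to parent".
    "relationships": [("song", "album"), ("album", "artist")],
}

# Logical level: map each entity to a table and each relationship to a foreign key.
def to_ddl(model):
    parents = {}
    for child, parent in model["relationships"]:
        parents.setdefault(child, []).append(parent)
    statements = []
    for entity, attributes in model["entities"].items():
        columns = [f"{entity}_id INTEGER PRIMARY KEY"]
        columns += [f"{attr} TEXT" for attr in attributes]
        columns += [
            f"{p}_id INTEGER REFERENCES {p}s ({p}_id)" for p in parents.get(entity, [])
        ]
        statements.append(f"CREATE TABLE {entity}s ({', '.join(columns)})")
    return statements

ddl = to_ddl(conceptual)

# Physical level: how the data is actually stored. Here that means an
# in-memory SQLite database plus an index to speed up lookups by album.
conn = sqlite3.connect(":memory:")
for statement in ddl:
    conn.execute(statement)
conn.execute("CREATE INDEX idx_songs_album ON songs (album_id)")

tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
print(tables)  # ['albums', 'artists', 'songs']
```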

These three levels ensure consistency and provide a plan for implementation and use.

For example, consider a simplified case in which we want to store songs. The entities are songs, albums, and artists, each with various attributes (shown in pink in the entity-relationship diagram), and the relationships between them are denoted by blue rhombuses. This gives us a conceptual idea of the data we want to store.

Next comes a corresponding schema using the relational model. The fastest way to create a schema is to translate the entities into tables, but just because it's the easiest doesn't mean it's the best. The ER diagram could be converted in other ways. For example, you could opt for a single table because you don't want to run so many joins to get song information. Or you could add tables for genre and label: many songs share these attributes, and having one place for them helps with data integrity.
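The trade-off between one table per entity and a single wide table can be sketched as follows. Table names, column names, and sample rows are invented for illustration; the point is only that both layouts answer the same question, one with joins and one without.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Option 1: one table per entity.
    CREATE TABLE artists (artist_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE albums  (album_id INTEGER PRIMARY KEY, title TEXT NOT NULL,
                          artist_id INTEGER REFERENCES artists (artist_id));
    CREATE TABLE songs   (song_id INTEGER PRIMARY KEY, title TEXT NOT NULL,
                          album_id INTEGER REFERENCES albums (album_id));

    -- Option 2: a single wide table, so no joins are needed.
    CREATE TABLE songs_flat (song_id INTEGER PRIMARY KEY, title TEXT NOT NULL,
                             album_title TEXT, artist_name TEXT);

    INSERT INTO artists VALUES (1, 'Artist A');
    INSERT INTO albums  VALUES (1, 'Album A', 1);
    INSERT INTO songs   VALUES (1, 'Song 1', 1), (2, 'Song 2', 1);
    INSERT INTO songs_flat VALUES (1, 'Song 1', 'Album A', 'Artist A'),
                                  (2, 'Song 2', 'Album A', 'Artist A');
""")

# Option 1 needs two joins to assemble the full song information.
joined = conn.execute("""
    SELECT s.title, al.title, ar.name
    FROM songs AS s
    JOIN albums  AS al ON al.album_id  = s.album_id
    JOIN artists AS ar ON ar.artist_id = al.artist_id
    ORDER BY s.song_id
""").fetchall()

# Option 2 reads everything from one table.
flat = conn.execute(
    "SELECT title, album_title, artist_name FROM songs_flat ORDER BY song_id"
).fetchall()

print(joined == flat)  # True
```

Note that the flat table stores 'Album A' and 'Artist A' once per song. Renaming the album there means updating every matching row, which is the data-integrity risk that separate tables avoid.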

The biggest difference between these three examples is how the tables are determined. Each has its own pros and cons, which the next chapter on normalization and denormalization expands on.

Dimensional Modeling
--------------------

Dimensional modeling is an adaptation of the relational model specifically for data warehouses. It's optimized for OLAP-type queries that aim to analyze rather than update. To do this, it uses the star schema, which the next chapter explores further. The schema of a dimensional model tends to be easy to interpret and extend, which is a big plus for analysts working on the warehouse.

Dimensional models are made up of two types of tables: fact tables and dimension tables. What the fact table holds is decided by the business use case: it contains records of a key metric, and this metric changes often. Fact tables also hold foreign keys to dimension tables. Dimension tables hold descriptions of specific attributes, and these do not change as often.

In our example, where we are analyzing songs, the turquoise table is a fact table called songs. It contains foreign keys to the purple dimension tables, which expand on the attributes of a fact, such as the album a song is in and the artist who made it. The records in the fact table change often as new songs are inserted. Albums, labels, artists, and genres are shared by more than one song, so the records in the dimension tables won't change as much.
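A fact table surrounded by dimension tables might look like the sketch below. The `fact_`/`dim_` prefixes, the `duration_sec` metric, and the sample data are assumptions made for the example (the tutorial's diagram does not specify the columns), and only two of the dimensions are included to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: descriptive attributes that rarely change.
    CREATE TABLE dim_artists (artist_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE dim_genres  (genre_id  INTEGER PRIMARY KEY, name TEXT NOT NULL);

    -- Fact table: one row per song, a metric, and foreign keys to the dimensions.
    CREATE TABLE fact_songs (
        song_id      INTEGER PRIMARY KEY,
        title        TEXT NOT NULL,
        duration_sec INTEGER NOT NULL,  -- assumed metric, for illustration only
        artist_id    INTEGER NOT NULL REFERENCES dim_artists (artist_id),
        genre_id     INTEGER NOT NULL REFERENCES dim_genres (genre_id)
    );

    INSERT INTO dim_artists VALUES (1, 'Artist A'), (2, 'Artist B');
    INSERT INTO dim_genres  VALUES (1, 'Rock'), (2, 'Jazz');
    INSERT INTO fact_songs  VALUES
        (1, 'Song 1', 200, 1, 1),
        (2, 'Song 2', 180, 1, 1),
        (3, 'Song 3', 240, 2, 2);
""")

# An analytical (OLAP-style) query: aggregate the fact table by a dimension.
per_genre = conn.execute("""
    SELECT g.name, COUNT(*) AS songs, SUM(f.duration_sec) AS total_sec
    FROM fact_songs AS f
    JOIN dim_genres AS g ON g.genre_id = f.genre_id
    GROUP BY g.name
    ORDER BY g.name
""").fetchall()
print(per_genre)  # [('Jazz', 1, 240), ('Rock', 2, 380)]
```

New songs add rows only to `fact_songs`; the genre and artist rows are reused, which is why the dimension tables change far less often.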

Summing up: to decide on the fact table in a dimensional model, consider what is being analyzed and how often entities change.
