Databases and schemas are used to organize data stored inside tables in Snowflake data warehouse, similar to how relational databases are commonly used by data scientists, analysts, and engineers.
Unlike how Snowflake data platform defines warehouses to mean virtual compute resources that process data, the Snowflake terms database and schema have more traditional definitions.
“A database is a logical grouping of schemas. Each database belongs to a single Snowflake account. A schema is a logical grouping of database objects (tables, views, etc.). Each schema belongs to a single database.” – Snowflake Docs
Together, a database and schema create what is called a “namespace” in Snowflake. Typically, the namespace is inferred automatically while using Snowflake, based on the current database and schema that are in use during your session, but you can specify the namespace manually.
One important feature of Snowflake’s storage system is the ability to create a “share,” a set of database objects (such as schemas that contain tables and secure views) that you would like to share with other Snowflake accounts. Shares are one way Snowflake protects data privacy.
You manipulate any of these organizational pieces of Snowflake’s data storage via Snowflake’s data definition language (DDL), which is based on SQL (Structured Query Language). For a database, schema, or share, the main commands are CREATE, ALTER, DROP, and SHOW.
How Does Snowflake Storage Work?
Under the hood, groups of rows in Snowflake’s storage tables are stored as individual micro-partitions, organized internally as groups of columns. Since Snowflake is meant for big data, a Snowflake table in storage may contain thousands or millions of micro-partitions.
“When data is loaded into Snowflake, Snowflake reorganizes that data into its internal optimized, compressed, columnar format. Snowflake stores this optimized data in cloud storage.” – Snowflake Docs
Snowflake’s unique internal architecture avoids unnecessary scanning of micro-partitions during database queries, which significantly accelerates the performance of queries referring to a given set of columnar data. This process happens automatically without additional configuration.
Unlike other database solutions, data stored inside Snowflake is not accessible directly; it’s only accessible via SQL commands. And, while there’s no limit to the number of databases, schema, or tables you can create, there is a storage cost associated with keeping data inside Snowflake.
If you enjoyed this article on Snowflake analytics, please consider signing up for our free email newsletter, following our Twitter account @propeldatacloud, or joining our customer waiting list to stay in touch with the latest news and education from Propel Data.