### What is Snowflake?
- Cloud #data-warehouse solution.
- #columnar-data-storage.
- Use Cases:
    - #BI
    - Data Science
    - #data-ingestion
    - #data-warehouse
    - #data-sharing

Advantages of a cloud #data-warehouse:
- Scalability
- Accessibility
- Cost efficiency
- Lower management effort

### Row vs. Columnar Database
![[row-vs-columnar-storage.png]]

#row-data-storage
- Organized in rows
- When retrieving data, you query complete records.
- Quick retrieval of individual records
- Efficient inserts, updates, and deletes of individual records
- Transactional operations
- #PostgreSQL, #MySQL, #Oracle, #Microsoft-SQL-Server

#columnar-data-storage
- Organized in columns
- When retrieving data, you query only the relevant columns.
- Excels at analytical operations because it only needs to access and process the columns relevant to the operation.
    - Taking the average price in the image above is much more efficient with #columnar-data-storage
- #analytics operations
- #Snowflake, #Amazon-Redshift, #Google-BigQuery, #Vertica

### Shared-Disk and Shared-Nothing Architecture
In #shared-disk architecture, all nodes share the same storage.
In #shared-nothing architecture, each node has its own dedicated compute and storage, and nodes share nothing with one another.

![[shared-disk-vs-nothing.png]]

Snowflake combines elements of both: data lives in a central storage layer, as in #shared-disk, while queries run on independent compute clusters, as in #shared-nothing. This is the key concept of **Decoupling Storage and Compute**.

This ensures:
- Data is stored efficiently
- Independent data processing
- Components operate independently

Benefits:
- Enhanced scalability
- Faster #data-processing and response
- Cost-effective operations

### Layers of Snowflake's architecture
1. Storage Layer
    - #columnar-data-storage
    - Optimized
        - Automatically organizes data
    - Compressed
        - Automatically reduces storage size
    - Tables, schemas, #database
2. Compute Layer
    - Query execution
    - Virtual Warehouses
        - Temporary computing resources that start up when a user submits a query.
        - Perform filtering, aggregation, etc.
        - Available in sizes such as Small, Medium, and Large, chosen to fit the task.
3. Cloud Services Layer
    - Infrastructure management
    - Query Optimization
    - Authentication
    - Access Control
    - Security

![[snowflake-architecture.png]]

### Competitors
- #Google-BigQuery
- #Amazon-Redshift
- #databricks
- #PostgreSQL

![[snowflake-competitor-comparisson.png]]

#databricks allows for the storage of #unstructured data.

All of them support #semi-structured data formats:
- JSON
- Avro
- Parquet
- CSV

![[snowflake-competitor-pricing.png]]
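
### Sketch: average price in row vs. columnar layout
To make the row versus columnar comparison from earlier in these notes concrete, here is a minimal Python sketch. It is not how Snowflake or any real database engine works internally. The product records, the helper names `average_price_row_store` and `average_price_column_store`, and the "fields read" counter are all hypothetical, chosen only to show why an aggregate over one column touches less data in a columnar layout.

```python
# Hypothetical product records (illustrative values, not taken from the image).
rows = [
    {"id": 1, "name": "keyboard", "price": 40.0},
    {"id": 2, "name": "mouse", "price": 20.0},
    {"id": 3, "name": "monitor", "price": 180.0},
]

# Columnar layout: the same data stored as one list per column.
columns = {key: [row[key] for row in rows] for key in rows[0]}


def average_price_row_store(records):
    """Average price when each read returns a complete record."""
    fields_read = 0
    total = 0.0
    for record in records:
        fields_read += len(record)  # every field of the record is fetched
        total += record["price"]
    return total / len(records), fields_read


def average_price_column_store(cols):
    """Average price when only the 'price' column is read."""
    prices = cols["price"]
    return sum(prices) / len(prices), len(prices)


row_avg, row_fields = average_price_row_store(rows)
col_avg, col_fields = average_price_column_store(columns)

print(f"row store:    avg={row_avg}, fields read={row_fields}")
print(f"column store: avg={col_avg}, fields read={col_fields}")
```

Both layouts give the same average, but the row layout reads every field of every record while the columnar layout reads only the `price` values. With wide tables and many rows, that gap is what makes columnar storage attractive for #analytics operations.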