Revolutionize your Data Lakehouse efficiency with lightning-fast analytics using Starburst Galaxy
$1,800.00
August 25 @ 9:00 am – August 27 @ 5:00 pm UTC+5:30
About the event
Join us for a technical deep-dive workshop where you’ll discover how to fully optimize and manage your data lakehouse architecture using Starburst Galaxy. If you’re ready to scale data pipelines and unlock blazing-fast performance, this event is for you.
What You’ll Learn
- Use Starburst Galaxy as a unified access layer for multi-source data environments.
- Construct and query partitioned Hive & Iceberg tables with precision.
- Improve data lakehouse performance using smart file formats and partitioning strategies.
- Leverage the cost-based optimizer (CBO) and query plan analysis to speed up queries.
- Establish robust role-based access controls for governance and security.
Unlock enterprise-grade scalability and performance for your data lakehouse strategy.
Key Outcomes
- Confidently optimize data lake performance using Starburst.
- Implement partitioning, bucketing, and table evolution strategies.
- Analyze and optimize query plans for cost-efficiency.
- Apply role-based access control for enterprise-grade data governance.
Prerequisites
- Solid understanding of SQL
- Familiarity with data warehousing or data lakehouse architecture
Target Audience
- Data Engineers
- Data Architects
- Senior Data Analysts & Scientists
- Cloud Infrastructure Specialists
Starburst features
- Overview
- Architecture
- Web UI
- Connectors & catalogs
- Client tools integrations
Data lake performance
- Foundations and use cases
- Limiting data exchanges
- File format options
- Small files problem
- Partitioning & bucketing
Table formats
- Moving beyond Hive
- Compare/contrast alternatives
- Explore Delta Lake
Apache Iceberg
- Creating tables
- Insert, update & delete
- CDC with merge
- Schema & partition evolution
- Snapshots & compaction
Parallel processing
- Divide & conquer
- Beyond single-stage queries
Cost-based optimizer
- Benefits of statistics
- Query plan analysis
Access control
- Configuration options
- Role-based access control
Data pipelines
- Definition & differentiation
- Reference architecture