This past June, Ethan Cha, Senior Machine Learning Ops Engineer, was selected to present at the 2023 Snowflake Summit in Las Vegas, Nevada. We sat down with him post-conference to get the inside scoop on his experience.
Can you walk us through a typical day at Snowflake Summit?
My typical day at Snowflake Summit was a marathon of fascinating sessions and power networking—ending with a happy hour! The conference agenda was full of exciting new releases from the Snowflake team and use-case specific sessions from Snowflake partners and consumers. A highlight for me was the happy hour hosted by a forward-thinking venture capital firm, where I was able to mingle with industry trailblazers: investors, founders, product managers and engineers. The agenda was brimming with insightful breakout discussions about the future of AI and data science, including a particularly intriguing one about bridging the knowledge gap in AI.
That sounds incredible! What was the most interesting session you attended and why?
I really enjoyed Introduction to Iceberg, a session led by Ryan Blue, the creator of Apache Iceberg, about its integration with Snowflake. The conversation drilled down into how Snowflake's adoption of Iceberg could revolutionize the work of data engineers by improving data lineage, version control and auditing through efficient management and use of table metadata. More efficient table management and faster query times have huge potential to minimize downtime and maximize productivity.
Tell us about your speaking session. What did you present?
I had the opportunity to present MLeX, a machine learning monitoring solution built in-house and powered by Streamlit and Snowpark, to more than 20 data and machine learning professionals. MLeX centralizes the monitoring of all models in production and serves as a tool for informed, proactive decision making. It is designed with transparency, interpretability and standardization of models in mind.
The main features of MLeX are:
- Drift detection: MLeX detects differences in data distribution between the training and inference datasets. This can be used to trigger a re-training of the model, or to adjust the model's parameters to compensate for the drift.
- Model performance tracking: MLeX tracks the technical performance metrics of the model in inference. This can be used to identify problems with the model, such as overfitting or underfitting. It can also be used to track the model's performance over time.
- Model explainability: MLeX helps users analyze and understand the predictions output by the ML model. This is important for ensuring that the model is making fair and unbiased predictions. It can be used to identify the factors that matter most to the model's predictions and to understand how the model makes its decisions.
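To make the drift-detection idea concrete, here is a minimal sketch. It is not MLeX's implementation, which the interview does not describe. It assumes a single numeric feature and uses the Population Stability Index (PSI), a common drift metric chosen here purely for illustration. The function names `psi` and `needs_retraining`, the bin count and the 0.2 threshold (a widely used rule of thumb) are all assumptions.

```python
import math
import random


def psi(train, inference, bins=10, eps=1e-6):
    """Population Stability Index between two numeric samples (illustrative)."""
    lo, hi = min(train), max(train)
    # Equal-width bins over the training range; fall back to 1.0 if all values match.
    width = (hi - lo) / bins or 1.0

    def fractions(values):
        counts = [0] * bins
        for v in values:
            # Clamp inference values outside the training range into the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # Floor each fraction at eps so the logarithm below is always defined.
        return [max(c / len(values), eps) for c in counts]

    p, q = fractions(train), fractions(inference)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


def needs_retraining(train, inference, threshold=0.2):
    """Hypothetical trigger: flag the model when PSI exceeds the threshold."""
    return psi(train, inference) > threshold


# Synthetic data standing in for one feature at training and inference time.
rng = random.Random(0)
train = [rng.gauss(0.0, 1.0) for _ in range(5000)]
same = [rng.gauss(0.0, 1.0) for _ in range(5000)]
shifted = [rng.gauss(1.5, 1.0) for _ in range(5000)]

print("same distribution:", needs_retraining(train, same))
print("shifted distribution:", needs_retraining(train, shifted))
```

In a real monitoring tool, a check like this would typically run per feature on a schedule, with thresholds tuned to each model rather than fixed at one value.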
I was eager to share as many key takeaways and learnings from developing MLeX as possible. I touched on the business need for an ML monitoring solution, the technical components that power the solution and the design decisions that were incorporated into building the solution. I then demonstrated the tool to the audience. I am grateful for the opportunity to present MLeX to this group of professionals!
What is one thing you learned at Snowflake Summit that you will bring back to your team at Cedar?
At Cedar, we are no strangers to the rigors of examining and integrating new technology into our workflows. But after learning more about Snowflake's transition from a data warehouse solution to a data cloud, I'm excited about the slew of updates, including Snowpark, Streamlit, container services and Snowflake CI/CD. It's always a win when you can bring home knowledge that not only sparks excitement, but also has secure and practical on-the-ground application.
Lastly, what’s one piece of advice you would give to someone attending Snowflake Summit for the first time? How can they make the most of their experience?
My golden rule for conference first-timers is: get to know the lay of the land. At the Snowflake Summit, I learned first-hand that the conference venues (Caesars Palace and Caesars Forum) were much farther apart than I'd reckoned. As a result, I had to hustle between sessions, and I sometimes missed out because rooms were at full capacity 😢. So, for the love of all things data, plan your itinerary with the conference map in mind!
Ethan, thank you so much for sharing your experiences with us. Where can readers learn more about the summit?
Anytime! You can catch all of the 2023 Snowflake Summit sessions here and sign up for next year’s summit in San Francisco here.
To learn more about Ethan Cha, connect with him on LinkedIn. Make sure to check out Ethan’s article on the blog: Machine Learning Infrastructure at Cedar.
To learn more about Cedar, visit our website here. Interested in a career at Cedar? Check out our careers page.