LISA18 has ended
Wednesday, October 31 • 2:00pm - 2:30pm
Mastering Near-Real-Time Telemetry and Big Data: Invaluable Superpowers for Ordinary SREs

One of the fundamental requirements for a typical SRE team is the ability to gain solid operational insight into the holistic state of the system it supports. This usually involves collecting, aggregating, correlating, visualizing, and reacting to data generated by a diverse set of data sources.

As part of this talk I will go through the high-level approach, sample data sources, and some implementation details used by the Netflix Open Connect CDN Reliability Engineering team, which supports the infrastructure and services hosted on thousands of physical servers and provides streaming video delivery for hundreds of millions of clients.

I will cover some examples of using Hive, Presto, Spark, Elasticsearch, Tableau, and Netflix-developed tools for monitoring, alerting, debugging, long-term analysis, planning, etc. I will also speak about the benefits of correlating detailed data from server-reported and client-reported telemetry.

To be clear, this is not a talk about how to build, implement, develop, or support Big Data or near-real-time telemetry systems. It is all about how you can use them as a platform and a powerful toolset for making your operations team stronger.


Ivan Ivanov

Ivan is a Senior CDN Reliability Engineer on the Netflix Open Connect team. He has been designing, deploying, supporting, and optimizing online services on a global scale in various operations roles for the last 17+ years. He is focusing on service reliability, availability, scalability...

Wednesday October 31, 2018 2:00pm - 2:30pm CDT
Legends Ballroom ABC