Thank you. This is a nice summary of Netflix’s data stack. Can you say anything further as to why Netflix uses AWS Redshift at all? Was it their original data warehouse before they adopted a lakehouse on Iceberg? Or how do they decide to use Redshift over Iceberg/Trino/Spark? In their ecosystem, do consumers such as Tableau or data analysts consume all data via Trino? Or do some query Redshift directly?
Overall this summary makes perfect sense to me, with the exception of Redshift. I just don’t get its purpose.
Hey Kent,
Thanks for the comment.
I don't have a concrete answer at the moment; if I find something, I will let you know.
I do believe that since Netflix is big, they still have duplicate tools, and some systems still rely on those. Redshift is probably the older one, and they adopted the lakehouse later.
A great analysis, thank you. I'm curious about their security stack and how they reduce their attack vectors. Could you shed some light on that, please?
Hey, I don't know about security, as that's something I have not read about.