Grant Brunner | Extreme Tech
Today, most large online services aren’t hosted on a single server. Amazon, iTunes, and Xbox Live all run on many networked servers spread around the world. Splitting the load across many servers and locations has clear benefits, but cloud computing also brings its own problems, such as latency and instability. Top network engineers are working to smooth out these problems as they arise, and Netflix just took a big step toward making cloud services more resilient.
Announced this week as Hystrix, this system was originally developed by the Netflix API team back in 2011 to control the interactions between Netflix’s distributed services and systems, stepping in to prevent cascading failures when they seem likely. As of today, anyone can use Hystrix completely free, because it has been officially released on GitHub under the Apache 2.0 license. The announcement does a good job of showcasing the enormous scale at which this system works: “Today tens of billions of thread-isolated and hundreds of billions of semaphore-isolated calls are executed via Hystrix every day at Netflix and a dramatic improvement in uptime and resilience has been achieved through its use.” This might not sound all that exciting at first, but it could have huge implications for the online services we already use as well as the services of the future.
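To make “stepping in to prevent cascading failures” concrete, here is a minimal sketch of one common technique for it, a circuit breaker with a fallback. It is an illustration only, not Hystrix’s API or Netflix’s implementation: the class name `CircuitBreaker`, its parameters `failure_threshold` and `reset_after`, and the injectable `clock` are all assumptions made up for this example.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitBreaker:
    """Illustrative circuit breaker (hypothetical, not Hystrix's actual code)."""

    def __init__(self, failure_threshold: int = 3, reset_after: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold  # consecutive failures before opening
        self.reset_after = reset_after              # cool-down in seconds
        self.clock = clock                          # injectable so tests can control time
        self.failures = 0
        self.opened_at: Optional[float] = None      # None means the circuit is closed

    def is_open(self) -> bool:
        if self.opened_at is None:
            return False
        # Once the cool-down has elapsed, report "not open" so one trial call goes through.
        return self.clock() - self.opened_at < self.reset_after

    def call(self, action: Callable[[], T], fallback: Callable[[], T]) -> T:
        # While open, skip the struggling dependency entirely and answer from the fallback.
        if self.is_open():
            return fallback()
        try:
            result = action()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            return fallback()
        # A success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A caller would wrap each remote request, for example `breaker.call(fetch_recommendations, lambda: [])`, where `fetch_recommendations` is a hypothetical function. The point of the design is that a failing dependency stops receiving traffic for a while, so its slowness does not back up into every service that depends on it.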
Previously, large companies needed to hire a team of network engineers to develop and maintain a completely new system for managing latency and cascading failure in cloud computing. Now, companies can use Hystrix and have their engineers focus on tailoring it to their specific needs. Because so much of today’s technology is developed by independent companies, a lot of work is needlessly duplicated. Instead of reinventing the wheel, network engineers can take projects like Hystrix and build upon them. This is great news for online service companies, but it is even better news for consumers. Better, more reliable services are something to get worked up about. Just a few months ago, Amazon’s cloud service went down, and it took companies like Netflix and Instagram with it. Cloud computing is still far from its pinnacle, so having a newly available tool for reliability in everyone’s arsenal is good news.
Netflix isn’t just releasing this code out of the goodness of its heart. Heck, the release could even benefit its competitors. So, why release it at all? It’s crowdsourcing of a kind. Additions and improvements other people make to this project can be put to use directly on Netflix’s own servers. This is a superb example of a rising tide lifting all boats. By taking this step, Netflix is betting that distributed, communal work on the problem of cloud reliability is a better long-term solution than everyone working separately. Let’s hope Netflix’s decision turns out well for the company, and that we all benefit from faster and more stable cloud services across the board.