70% reduction in tickets
Performance assurance, with software continuously making resource decisions that keep application response time low
11% improvement in node density
Operating at Scale Challenges Performance and Resiliency
Rafael Noval, Technology Leader of NOC & Web at SulAmérica, leads the team responsible for the performance of 57 mission-critical applications. These applications comprise approximately 7,000 containers (~3,000 pods) and run on a hybrid mix of Red Hat OpenShift (virtualized and bare metal), as well as Google GKE and Amazon EKS.
Operating Kubernetes at scale is a complex undertaking. Like many organizations, the team relied on monitoring, leveraging OpenShift and other dashboards to manage the platform and their APM tool to monitor all their applications. When there were resource bottlenecks, Noval’s team would have to chase utilization metrics in an attempt to identify the problem. This manual approach left them reacting to performance issues after they occurred, which slowed down the team and the innovation that their investment in Kubernetes was intended to drive. It could not scale with SulAmérica’s business needs. Resiliency was also becoming an issue. When one of their bare metal nodes failed due to an expired host certificate, the team knew there had to be a better way.
“The business relies on us to assure the performance of our applications so that our customers have the best experience when they use our digital services,” says Emerson Freitas, Analyst Technology Leader of NOC & Web. “As our business grew, the OpenShift platform had to scale, and we knew there had to be a better way to manage the complexity.”
When the team automated Turbonomic’s application resource management, it marked the beginning of a new and simpler way to operate, along with dramatic improvements in application performance. They automated continuous pod placement in their Production environment and intelligent container rightsizing in their Dev environment. As a result, the team saw a 70% reduction in tickets and an 11% improvement in node density.
Turbonomic Keeps Application Response Times Low During Peak Holiday Demand
Today Turbonomic is assuring the performance of all 57 mission-critical applications while simplifying operations. Noval, Freitas, and the rest of the team understand that customer experience is essential to SulAmérica’s business. When they think about performance assurance, they feel ownership not just for uptime but also for application response time.
As such, connecting Turbonomic to their APM was a natural next step; it allowed them to see exactly how the automation was keeping response times low, even as demand across applications fluctuated.
SulAmérica's Health Insurance App Seamlessly Performs as COVID-19 Drives Peak Demand
In January and February of 2021, for example, Brazil was in the throes of the COVID-19 pandemic. The SulAmérica Saude app, which health insurance beneficiaries use to find doctors, get medical advice, and book appointments, saw a significant spike in demand. But with Turbonomic dynamically adjusting resources to meet that demand, application response times for these mission-critical services were kept low.
Above: As the SulAmérica Saude app experiences peak demand (gray line), Turbonomic maintains low response time (blue line).
SulAmérica's Flight Insurance Service Experiences Peak Demand, But Eager Travelers Still Enjoy a Smooth Booking Experience
Another example occurred in April of the same year, during the Easter holiday. An application that provides travel insurance for bookings with one of the largest low-cost airlines in the region saw a significant spike in demand during the four-day event in Brazil. Again, Turbonomic kept response time low by dynamically and automatically adjusting resources to meet that demand.
Above: As SulAmérica’s travel insurance app experiences peak and fluctuating demand (gray area), Turbonomic maintains low response time (blue line).
Left: SulAmérica’s travel insurance app as it appears in the Turbonomic UI, with full-stack (app to platform to infrastructure) visibility and control.
With Turbonomic, the NOC & Web team ensures that customers have a seamless experience—no matter what—as they transact with SulAmérica.
"We trust Turbonomic’s automation to give our applications exactly what they need to perform, running in the background and keeping response times low. The results have been transformative for our business and our customers,” says Freitas. “It’s also given our team time back to work on new projects for SulAmérica."
“The continuous application performance and automation we achieved with Turbonomic managing OpenShift, we plan to do the same for our on-prem infrastructure, our cloud environments, and any new containerized environments,” says Noval. “Turbonomic will be our standard of control across our hybrid cloud.”
Expanding Automation Across Hybrid Cloud
SulAmérica has a hybrid cloud strategy, leveraging multiple cloud providers alongside their on-prem estate based on what their applications require. Some applications are deployed on-prem because they have dependencies on the legacy environment, such as accessing data that is hosted there. Other applications are leveraging Google or AWS services, in which case they are deployed to GKE or EKS. Wherever these mission-critical applications run, they must maintain low response times and the platforms they run on must be resilient.
With the time they have gotten back by automating Turbonomic, the team has been able to focus on strategic projects for the business, the largest being their “Global Load Service” project. With this initiative, they are architecting the platforms to ensure application security and availability across their data centers and cloud providers. If they lose connectivity in the cloud or on-prem, the DR active-active implementation across their hybrid cloud will ensure that their most mission-critical applications never fail to serve their customers. Turbonomic is a critical piece of this hybrid cloud strategy, as it will continuously maintain low application response times, whether the applications run on OpenShift, GKE, EKS, Nutanix Karbon, or other Kubernetes distributions, or in their traditional environments running on VMware or public cloud IaaS.
As part of this new standard of control, Noval’s team will also be using Turbonomic to dynamically manage resources across the application lifecycle. As applications move from Dev to Staging to Production, the team will leverage Turbonomic’s intelligent actions to ensure that applications get the resources they need to perform, from the moment they are deployed to Dev through to the high-stakes Production environment that directly impacts the customer experience. In addition to the performance benefits, the team will be leveraging Turbonomic’s ability to automatically maintain compliance: Turbonomic ingests Kubernetes node labels and will automatically move pods while accounting for those constraints. And as the platform scales to support new applications and new lines of business, they will use Turbonomic’s planning capabilities as well.
Operationalizing SLO-Driven Performance
SulAmérica’s digital transformation is an ongoing journey of modernizing applications to best serve their customers. Not all applications are built the same, and how they are re-architected or re-platformed can differ as well. But Turbonomic gives the team a unified view of cloud native and traditional applications to ensure that all applications perform continuously through every phase of transformation.
Noval and his team understand the impact of application response time on the customer experience. They hope to expand Turbonomic automation to the SRE teams, who are focused on meeting specific Service Level Objectives (SLOs). With Turbonomic, SulAmérica can fully leverage the benefits of cloud native application elasticity by having the software dynamically manage resources based on those SLOs.
“Our team’s macro goal is to deliver an application- and SLO-driven hybrid cloud,” says Noval. “Applications will run wherever it best suits the business, and they will continuously perform and delight our customers. Turbonomic is not only helping us specifically operationalize our vision, but also giving us the time back to focus on what strategically impacts SulAmérica’s business.”