Serverless AWS Migration
Migrating to a serverless architecture helped this AI platform improve its responsiveness to customer requests and lower costs by 70%
Cyrano.ai is a leading-edge Natural Language Processing AI solution that uses language to understand people and offers advice for building effective relationships with them. Its platform was initially developed on AWS using expensive high-end instances. While the technology was advanced, technical debt had mounted and was beginning to slow innovation and impede Cyrano’s ability to execute.
Bringing a Customer Focus to Technology
A pipeline of interested customers was waiting on feature requests that remained stuck in the backlog behind stability fixes, yet many of the proposed fixes were band-aids that did not guarantee a long-term solution. Ryan Huff, a co-founder of Cambium, discussed the potential benefits of migrating the Cyrano platform to a serverless architecture on AWS.
Serverless has several benefits, specifically:
- Reduced compute costs. You only incur costs for execution time and the amount of memory allocated to a function.
- No servers to manage. Functions are invoked in sandboxed containers that are configured for a specific runtime and abstract away the particulars of the underlying OS.
- Highly scalable. Functions are invoked as requests come in. Safeguards are available to limit how many concurrent requests are allowed to run.
- Flexible deployment. Updates to individual functions can be deployed without changing the rest of the environment.
But before proposing a formal solution, we wanted to understand where the biggest scalability and cost issues were.
API usage was not tracked at the level of individual endpoints, so we created a short survey to send to the customers with the highest overall utilization. The survey showed that 3 specific API endpoints accounted for over 80% of platform usage, and an analysis of server logs confirmed the results.
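The log analysis mentioned above could look roughly like the following. This is a minimal sketch, not the script we actually ran: it assumes access logs in a common-log-style format with the request line in double quotes, and the function names are illustrative.

```python
from collections import Counter


def endpoint_usage_share(log_lines):
    """Return {path: fraction of all requests}, most-used path first."""
    counts = Counter()
    for line in log_lines:
        # Assumed line format (hypothetical):
        # <ip> - - [<timestamp>] "<METHOD> <path> HTTP/1.1" <status> <bytes>
        try:
            request = line.split('"')[1]
            _method, path, _protocol = request.split()
        except (IndexError, ValueError):
            continue  # skip lines that do not match the assumed format
        counts[path.split("?")[0]] += 1  # ignore query strings
    total = sum(counts.values())
    if total == 0:
        return {}
    return {path: n / total for path, n in counts.most_common()}


def top_n_share(shares, n=3):
    """Combined share of traffic handled by the n busiest endpoints."""
    return sum(list(shares.values())[:n])
```

Calling `top_n_share(endpoint_usage_share(lines))` on a representative sample of log lines gives a single number to compare against the survey result.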
We calculated the cost per call for each of the 3 API endpoints from its average operation time and the cost of all server instances that serve the API. We excluded database instance costs from these calculations because we anticipated keeping the database on the existing RDS instances; we were not migrating the entire API immediately. We determined that moving these top 3 endpoints to AWS Lambda would let us reduce the size and number of EC2 instances and cut EC2 costs by 75%.
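One way to turn average operation time and instance costs into a cost per call is sketched below. The allocation rule is an assumption for illustration: the monthly cost of the API fleet is split across endpoints in proportion to their total busy time (calls times average duration) and then divided by the number of calls. The function name and input shape are hypothetical.

```python
def cost_per_call(monthly_fleet_cost, endpoints):
    """Estimate cost per call for each endpoint.

    endpoints maps a name to (calls_per_month, avg_seconds_per_call).
    Fleet cost is allocated by each endpoint's share of total busy time
    (an assumed allocation rule, not the only reasonable one).
    """
    busy = {name: calls * secs for name, (calls, secs) in endpoints.items()}
    total_busy = sum(busy.values())
    result = {}
    for name, (calls, _secs) in endpoints.items():
        if calls == 0 or total_busy == 0:
            continue  # no traffic, so no meaningful per-call cost
        allocated = monthly_fleet_cost * busy[name] / total_busy
        result[name] = allocated / calls
    return result
```

Under this rule, an endpoint that runs three times longer per call is charged three times as much per call, which is what makes the slowest high-volume endpoints stand out as migration candidates.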
With an understanding of the most heavily used endpoints and their costs, we began considering a potential serverless architecture, prioritizing the endpoints that offered the greatest benefit for the cost. We estimated AWS Lambda costs from the number of API requests per endpoint, assuming that execution time would be the same as on EC2. The estimated compute savings from this approach were nearly 90%.
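The Lambda estimate follows the billing model described earlier: a charge per request plus a charge for execution time multiplied by allocated memory. The sketch below assumes illustrative default prices (US dollars per GB-second and per million requests) and ignores any free tier; check current AWS pricing for the region and architecture in question before relying on the numbers.

```python
def estimate_lambda_monthly_cost(
    requests,
    avg_duration_s,
    memory_mb,
    price_per_gb_second=0.0000166667,   # assumed illustrative price
    price_per_million_requests=0.20,    # assumed illustrative price
):
    """Estimate monthly Lambda compute cost for one endpoint."""
    gb_seconds = requests * avg_duration_s * (memory_mb / 1024)
    duration_cost = gb_seconds * price_per_gb_second
    request_cost = requests / 1_000_000 * price_per_million_requests
    return duration_cost + request_cost


def savings_fraction(current_cost, projected_cost):
    """Fraction of the current cost saved by moving to the projected cost."""
    return 1 - projected_cost / current_cost
```

Summing `estimate_lambda_monthly_cost` over the endpoints to be migrated and passing the total to `savings_fraction` along with the current EC2 spend yields the kind of percentage quoted above.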
Simplified Serverless Architecture for Cyrano.ai
The Phase I design of this migration was simple and straightforward:
- API Gateway. Handled API routing, including proxying to EC2 for APIs that weren’t migrated in Phase I.
- Lambda. Individual functions for each of the migrated endpoints. Node.js runtime.
- RDS. No data migration was involved in Phase I. Lambda functions connected to RDS through RDS Proxy to avoid maxing out database resources during periods of high volume.
- Continuous Integration. We used simple GitHub Actions workflows to automate code deployment to Lambda.
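The Phase I routing rule can be modeled in a few lines: migrated endpoints resolve to a Lambda function, and everything else falls through to the existing EC2 fleet. This is a conceptual sketch of the decision only, not API Gateway configuration, and the route paths and target names are hypothetical.

```python
# Endpoints already migrated to Lambda (hypothetical paths and names).
MIGRATED_ROUTES = {
    ("POST", "/v1/analyze"): "lambda:analyze",
    ("GET", "/v1/profile"): "lambda:profile",
}

# Catch-all target for everything not yet migrated (hypothetical name).
LEGACY_TARGET = "http-proxy:ec2-load-balancer"


def resolve_target(method, path):
    """Return the backend that should serve this request in Phase I."""
    return MIGRATED_ROUTES.get((method.upper(), path), LEGACY_TARGET)
```

Because unmatched requests keep flowing to EC2, endpoints can be moved one at a time by adding entries to the migrated set, with no change visible to API clients.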
Security Considerations
Before finalizing this architecture, we needed to ensure that our platform, and all of its cloud resources, were as secure as possible. We always strive to follow best practices and ensure that the architecture takes these considerations into account. In this case, we chose to work with a team of AWS security specialists to audit our infrastructure. The audit involved in-depth AWS configuration scans.
Later, Cyrano worked with Leviathan Security to conduct automated penetration testing on publicly accessible resources, followed by manual security testing. All of this came at significant expense, but because Cyrano needed to stand up to scrutiny from its customers' enterprise security teams, it was a worthwhile investment that positions Cyrano to confidently earn its customers' trust.
Iterating and Completing the Migration
Once customers were migrated to the new API host (pointing to API Gateway rather than the EC2 load balancer), we began moving off EC2 entirely. This involved migrating not just the remaining API endpoints but also the worker processes we had running on EC2. Lambda is well suited to asynchronous workloads and integrates with many other AWS services, including SQS, S3, DynamoDB, and EventBridge.
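As an illustration of an asynchronous workload, a worker that used to poll a queue on EC2 can become a function invoked with batches of SQS messages. The sketch below is written in Python for brevity (the migrated API functions used the Node.js runtime) and is not Cyrano's actual worker code: `process_message` and the `job_id` field are hypothetical placeholders. It returns the partial batch response shape so that only failed messages are retried, which assumes `ReportBatchItemFailures` is enabled on the event source mapping.

```python
import json


def process_message(payload):
    """Placeholder for the work a former EC2 worker would do (hypothetical)."""
    if "job_id" not in payload:
        raise ValueError("missing job_id")
    return payload["job_id"]


def handler(event, context):
    """Lambda entry point for an SQS-triggered worker."""
    failures = []
    for record in event.get("Records", []):
        try:
            process_message(json.loads(record["body"]))
        except Exception:
            # Report only this message as failed so SQS retries it alone.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

Returning an empty `batchItemFailures` list tells the service that the whole batch succeeded.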
We also incorporated different storage solutions that were more appropriate for specific needs than using RDS for all data. We were able to migrate large, infrequently accessed blobs to S3, and cache certain responses using DynamoDB. Once we were able to turn off the remaining EC2 instances, our overall AWS expenses dropped by over 70%. Additionally, even after completing just the initial migration of the highest-utilized APIs, the Cyrano team was able to finally focus its attention on revenue-generating requests from customers.
Future iterations include migrating certain APIs to AWS’s GraphQL solution, AppSync.