Cleaning up directory
equinix/api-team/proposals/ruby3-upgrades.org
@@ -0,0 +1,64 @@
#+TITLE: Ruby 3 Upgrades
#+AUTHOR: Adam Mohammed
#+DATE: May 10, 2023

* Agenda
- Recap: API deployment architecture
- Lessons from the Rails 6.0/6.1 upgrade
- Defining key performance indicators

* Recap: API Deployment
The API deployment consists of:
- **frontend pods** - 10 pods dedicated to serving HTTP traffic
- **worker pods** - 8 pods dedicated to job processing
- **cron jobs** - various rake tasks executed to perform periodic upkeep necessary for the API context

** Release Candidate Deployment Strategy
This is a form of canary deployment. We divert just a small amount of
traffic to the new version while watching for an increased error rate.
After some time, we assess how the candidate has been performing. If
things look bad, we scale back and address the issues. Otherwise, we
ramp up the amount of traffic that the release candidate pods see.
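The assess-then-ramp loop described above can be sketched roughly as follows. This is only an illustration: the method name `next_canary_step`, the traffic steps, and the error-rate tolerance are all hypothetical and not taken from our actual tooling.

```ruby
# Hypothetical sketch of the assess-then-ramp decision. The traffic steps and
# tolerance are illustrative assumptions, not values from our deployment.
TRAFFIC_STEPS = [5, 25, 50, 100].freeze # percent of traffic sent to the candidate

def next_canary_step(current_percent:, candidate_error_rate:, baseline_error_rate:, tolerance: 0.01)
  if candidate_error_rate > baseline_error_rate + tolerance
    # Things look bad: scale back and address the issues.
    { action: :scale_back, percent: 0 }
  else
    # Things look fine: move to the next larger traffic step (capped at 100).
    next_percent = TRAFFIC_STEPS.find { |step| step > current_percent } || 100
    { action: :ramp_up, percent: next_percent }
  end
end
```

Because the split comes from the k8s service balancing across pods, the percentage here would in practice be approximated by the share of pods running the candidate.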

Doing things this way allows us to build confidence in the release, but
it does not come without drawbacks. The most important thing to be
aware of is that we're relying on the k8s service to load balance
between the two versions of the application. That means we're not
doing anything to make sure that a customer only ever hits a single
app version.

We accept this risk because issues with HTTP requests are mostly
confined to the request, and each span records the Rails version that
processed that portion of the request.

Some HTTP requests are not fully completed within the request/response
cycle. For these endpoints, we queue up background jobs that the
workers eventually process. This means that some requests will be
handled by the release candidate while the resulting background job is
processed by the older application version.

Because of this, when using this release strategy, we're assuming that
the two versions are compatible and can run side-by-side.
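To make the side-by-side assumption concrete, here is a minimal sketch of a job that tolerates payloads enqueued by either version. The job class, payload keys, and the use of JSON-serialized arguments are assumptions for illustration only and do not describe our actual jobs or queue.

```ruby
require "json"

# Hypothetical job; the class name and payload keys are illustrative only.
class ProvisionDeviceJob
  # Accept payloads from either app version: keys added by the release
  # candidate are optional with a default, and unknown keys are ignored
  # instead of raising.
  def perform(payload)
    device_id = payload.fetch("device_id")
    notify = payload.fetch("notify", false)
    "device=#{device_id} notify=#{notify}"
  end
end

# Simulate the queue: arguments are serialized when enqueued and
# deserialized by whichever worker version picks the job up.
def round_trip(payload)
  JSON.parse(JSON.generate(payload))
end

old_payload = round_trip({ device_id: 42 })
new_payload = round_trip({ device_id: 42, notify: true, extra_field: "ignored" })

puts ProvisionDeviceJob.new.perform(old_payload)
puts ProvisionDeviceJob.new.perform(new_payload)
```

The design point is that payload changes between versions stay additive, so a job enqueued by one version can still be processed by the other.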

* Lessons from Previous Rails Upgrades

* Defining key performance indicators
Typically, what I would do (and what I assume Lucas does) is just keep an eye on Rollbar. Rollbar captures anything broken badly enough to raise exceptions or errors in Rails. Additionally, I would keep a broad view of errors by span kind in Honeycomb to see whether a spike was associated with the release candidate.

- What we were looking at in the previous releases
  - Error rates by span kind per version

    This helps us know if the error rate for requests is higher in one version or the other, or if we're failing specifically in processing background jobs.

  - No surprises in Rollbar

Instead, we'd ideally track stable metrics that the system itself reports.
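For reference, the "error rates by span kind per version" view listed above amounts to a grouping like the one sketched below. The `Span` struct, its field names, and the span kind labels are hypothetical stand-ins and are not Honeycomb's schema or API.

```ruby
# Hypothetical span record; field names are assumptions for illustration.
Span = Struct.new(:version, :kind, :error, keyword_init: true)

# Group spans by [version, kind] and compute the fraction that errored.
def error_rates(spans)
  spans.group_by { |span| [span.version, span.kind] }.transform_values do |group|
    group.count(&:error).fdiv(group.size)
  end
end

spans = [
  Span.new(version: "6.0", kind: "http_request", error: false),
  Span.new(version: "6.0", kind: "http_request", error: true),
  Span.new(version: "6.1", kind: "http_request", error: false),
  Span.new(version: "6.1", kind: "background_job", error: true)
]

error_rates(spans).each do |(version, kind), rate|
  puts "#{version} #{kind}: #{rate}"
end
```

Comparing the same span kind across versions shows whether the candidate is worse for requests, for background jobs, or for both.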