Some new design docs
34
design/interconnection-models.org
Normal file
@@ -0,0 +1,34 @@
#+TITLE: How do Interconnections work for dummiez
#+AUTHOR: Adam Mohammed

* User Flows

A user starts by making an API call to
~POST /projects/:id/connections~. In this request they can choose
either a full dedicated port, on which they get the full bandwidth, or
a shared port. The dedicated port guarantees the full bandwidth, but
is more costly.

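As a rough illustration of that choice, the sketch below builds the
path and body for such a request without sending it. The field names
(~name~, ~type~), the allowed values and the helper itself are
assumptions made for illustration; this document does not define the
request schema.

```python
import json

# Hypothetical port types; the text only distinguishes dedicated vs shared.
PORT_TYPES = {"dedicated", "shared"}


def build_connection_request(project_id: str, name: str, port_type: str):
    """Return the (path, JSON body) for a connection request.

    The field names "name" and "type" are assumptions for illustration.
    """
    if port_type not in PORT_TYPES:
        raise ValueError(f"port_type must be one of {sorted(PORT_TYPES)}")
    path = f"/projects/{project_id}/connections"
    body = json.dumps({"name": name, "type": port_type}, sort_keys=True)
    return path, body


path, body = build_connection_request("1234", "to-csp", "shared")
print("POST", path, body)
```
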
A user is also able to select whether the connection at Metal is the
A-side or the Z-side. If it's the A-side, Metal does the billing; if
it's the Z-side, Fabric takes care of the billing.

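The billing rule can be written down as a tiny lookup. This is only a
sketch: the ~Side~ enum, its string values and the function name are
hypothetical.

```python
from enum import Enum


class Side(Enum):
    # String values are placeholders, not taken from any API schema.
    A = "a_side"
    Z = "z_side"


def billing_owner(metal_side: Side) -> str:
    """Return which system bills for the connection, per the rule above."""
    return "Metal" if metal_side is Side.A else "Fabric"


print(billing_owner(Side.A), billing_owner(Side.Z))
```
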
A-side/Z-side is telecom terminology, where the A-side is the
requester and the Z-side is the destination. So in the case of
connecting to a CSP, we're concerned with the A-side from Metal,
because that means we're making use of Fabric as a service provider to
give us a connection to the CSP within the metro.

If we were making Z-side connections, we'd be granting someone else
in the DC access to our networks.

* Under the Hood

When the request comes in, we create:

- An interconnection object to represent the request
- Virtual ports
- Virtual circuits associated with each port
- A service token for each

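The objects in the list above could be modelled roughly as follows.
This is a sketch under assumptions: the class and field names are
invented, and "a service token for each" is read here as one token per
virtual circuit, which the text does not state explicitly.

```python
from dataclasses import dataclass, field
from typing import List
from uuid import uuid4


@dataclass
class ServiceToken:
    id: str = field(default_factory=lambda: uuid4().hex)


@dataclass
class VirtualCircuit:
    # Assumption: one service token per virtual circuit.
    token: ServiceToken = field(default_factory=ServiceToken)


@dataclass
class VirtualPort:
    circuits: List[VirtualCircuit] = field(default_factory=list)


@dataclass
class Interconnection:
    project_id: str
    ports: List[VirtualPort] = field(default_factory=list)


def create_interconnection(project_id: str, port_count: int = 1,
                           circuits_per_port: int = 1) -> Interconnection:
    """Create the objects listed above for one incoming request."""
    ports = [
        VirtualPort(circuits=[VirtualCircuit() for _ in range(circuits_per_port)])
        for _ in range(port_count)
    ]
    return Interconnection(project_id=project_id, ports=ports)


conn = create_interconnection("1234", port_count=2)
print(len(conn.ports), "ports created")
```
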
43
design/metal-fabric-message-bus.org
Normal file
@@ -0,0 +1,43 @@
#+TITLE: Metal Event Entrypoint
#+AUTHOR: Adam Mohammed

* Problem

We would like other parts of the company to be able to notify Metal
about changes to infrastructure that crosses out of Metal's business
domain. The concrete example here is for Fabric to tell Metal about
the state of interconnections.

* Solution

Metal's API team would like to expose a message bus to receive events
from the rest of the organization.

Metal's API currently sits on top of a RabbitMQ cluster, and we'd like
to leverage that infrastructure. There are a few problems we need to
solve before we can expose the RabbitMQ cluster.

1. RabbitMQ is currently only available within the cluster.
2. Fabric (and other interested parties) exist outside of the Metal
   firewalls that allow traffic into the K8s clusters.
3. We need to limit the blast radius if something were to happen on
   this shared infrastructure; we don't want the main operations on
   Rabbit that Metal relies on to be impacted.

For 1, the answer is simple: expose a path under
`api.core-a.ny5.metalkube.net` that points to the rabbit service.

For 2, we leverage the fact that CF and Akamai are whitelisted to the
Metal K8s clusters for the domains `api.packet.net` and
`api.equinix.com/metal/v1`. This covers getting the cluster exposed to
the internet.

For 3, we can make use of RabbitMQ [[https://www.rabbitmq.com/vhosts.html][Virtual Hosts]] to isolate the
/foreign/ traffic to that host. This lets us set up separate
authentication and authorization policies (such as using Identity-API
via the [[https://www.rabbitmq.com/oauth2.html][OAuth]] plugin), which are absolutely necessary since the core
infrastructure is now on the internet. We are also able to limit
resource usage by vhost to prevent attackers from affecting the core
API workload.

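A minimal sketch of what the vhost isolation in point 3 might look
like operationally. It only builds the `rabbitmqctl` command lines
(`add_vhost` and `set_vhost_limits`) and does not run them. The vhost
name `foreign` and the limit values are placeholders chosen for
illustration, not decisions made in this document.

```python
import json
import shlex
from typing import List


def vhost_setup_commands(vhost: str, max_connections: int,
                         max_queues: int) -> List[str]:
    """Return rabbitmqctl command lines that create a vhost and cap its usage."""
    limits = json.dumps({"max-connections": max_connections,
                         "max-queues": max_queues})
    return [
        shlex.join(["rabbitmqctl", "add_vhost", vhost]),
        shlex.join(["rabbitmqctl", "set_vhost_limits", "-p", vhost, limits]),
    ]


# Placeholder name and limits; real values would need capacity planning.
for command in vhost_setup_commands("foreign", max_connections=64, max_queues=32):
    print(command)
```

Per-vhost permissions and the OAuth plugin configuration would still
be needed on top of this; they are left out of the sketch.
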
26
design/multi-cloud-networking.org
Normal file
@@ -0,0 +1,26 @@
Ok, so I met with Sangeetha and Bob from MCNS and I think I have an
idea of what needs to happen for our integrated network for us to
build things like MCNS and VMaaS.

First, you just need two things to be able to integrate at the
boundaries of Metal and Fabric: a VNI and a USE port. Metal already
has a service which allocates VNIs, so I was wondering why Jarrod
might not have told MCNS about it. Since VNIs and USE ports are both
shared resources that we want a single bookkeeper over, there's only
one logical point to do that today, and that's the Metal API.

In a perfect world though, the Metal API doesn't orchestrate our
internal network state so specifically, at least I think. It'd be nice
if we could rip out the USE port management from the API and push that
down a layer, away from the customer-facing API. The end result is
that we have internal services (Metal API, MCNS, VMaaS) all building
on our integrated network, but we still have just a single source of
truth for allocating the shared resources.

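To make the "single bookkeeper" idea concrete, here's a rough sketch
of one pool that several internal consumers allocate from. The class,
the consumer names and the VNI range are all made up; this isn't how
the existing VNI allocation service is implemented.

```python
class SharedResourcePool:
    """Single bookkeeper for a shared resource such as VNIs or USE ports."""

    def __init__(self, resources):
        self._free = list(resources)
        self._owner = {}  # resource -> name of the consuming service

    def allocate(self, consumer: str):
        """Hand the next free resource to the named consumer."""
        if not self._free:
            raise RuntimeError("pool exhausted")
        resource = self._free.pop(0)
        self._owner[resource] = consumer
        return resource

    def release(self, resource) -> None:
        """Return a resource to the pool."""
        if resource not in self._owner:
            raise KeyError(f"{resource!r} is not allocated")
        del self._owner[resource]
        self._free.append(resource)

    def owner_of(self, resource):
        return self._owner.get(resource)


vnis = SharedResourcePool(range(10000, 10004))  # made-up VNI range
first = vnis.allocate("metal-api")
second = vnis.allocate("mcns")
print(first, vnis.owner_of(first), second, vnis.owner_of(second))
```

Since every consumer goes through the same pool, two services can't
end up holding the same VNI or port.
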
Sangeetha got a slice of VNIs and (eventually will have) USE ports for
them to build the initial MCNS product, but eventually we'll want to
bring those VNIs and ports under control of a single service, so we
don't have multiple bookkeeping spots for the same resources.
Jarrod's initial plan was to just build that into the Metal API, but
if we can,

122
design/nimf-m2.org
Normal file
@@ -0,0 +1,122 @@
#+TITLE: NIMF Milestone 2
#+SUBTITLE: Authentication and Authorization
#+AUTHOR: Adam Mohammed

* Overview

This document discusses authentication and authorization between Metal
and Fabric, focused on the customer's experience. We want to deliver a
seamless user experience that allows users to set up connections
directly from Metal to any of the Cloud Service Providers (CSPs) they
leverage.

* Authentication

** Metal

There are a number of ways to authenticate to Metal, but ultimately it
comes down to the mode that the customer wishes to use to access their
resources. The main methods are as a user signed in to the web portal,
and directly against the API.

Portal access uses an OAuth flow, which lets the browser obtain a JWT
that can be used to authenticate against the Metal APIs. It's
important to understand that the Portal doesn't make calls as itself
on behalf of the user; the user themselves makes the calls by way of
their browser.

Direct API access is done through static API keys issued either to a
user or to a project. Integrations through tooling and
language-specific libraries are also provided.

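The two access modes differ mainly in which credential accompanies
each request. The sketch below shows that split. Both header names
(=Authorization: Bearer= for the JWT, =X-Auth-Token= for the static
key) are assumptions for illustration, as this document does not name
them.

```python
from typing import Dict, Optional


def auth_headers(jwt: Optional[str] = None,
                 api_key: Optional[str] = None) -> Dict[str, str]:
    """Build request headers for exactly one of the two access modes."""
    if (jwt is None) == (api_key is None):
        raise ValueError("provide exactly one of jwt or api_key")
    if jwt is not None:
        # Portal mode: the browser presents the JWT from the OAuth flow.
        return {"Authorization": f"Bearer {jwt}"}
    # Direct API mode: a static user or project API key.
    return {"X-Auth-Token": api_key}


print(auth_headers(api_key="example-key"))
```
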
** Fabric

* Authorization

** Metal

** Fabric

Option 4 - Asynchronous Events

Highlights:
- Fabric no longer makes direct calls to Metal; it only announces that the connection is ready
- Messages are authenticated with JWT
- Metal consumes the events and modifies the state of resources as a controller

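A minimal sketch of the consume side of Option 4, using only the
standard library. It assumes HS256 tokens signed with a shared secret
and a message shape with =connection_id= and =state= claims. None of
that is specified here, and a real deployment would more likely verify
asymmetric signatures from the identity provider and check expiry.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign(claims: dict, secret: bytes) -> str:
    """Produce a compact HS256 JWT (what the sender would do)."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps(claims).encode())
    mac = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256)
    return f"{header}.{payload}.{_b64url(mac.digest())}"


def verify(token: str, secret: bytes) -> dict:
    """Return the claims if the HS256 signature is valid, else raise ValueError."""
    try:
        header, payload, signature = token.split(".")
    except ValueError:
        raise ValueError("malformed token") from None
    if json.loads(_b64url_decode(header)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    mac = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256)
    if not hmac.compare_digest(mac.digest(), _b64url_decode(signature)):
        raise ValueError("bad signature")
    return json.loads(_b64url_decode(payload))


def handle_event(token: str, secret: bytes, connections: dict) -> None:
    """Controller step: authenticate the message, then update local state."""
    claims = verify(token, secret)
    connections[claims["connection_id"]] = claims["state"]


secret = b"demo-secret"  # placeholder only
state = {}
handle_event(sign({"connection_id": "c1", "state": "ready"}, secret), secret, state)
print(state)
```
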
Option 5 - Callback/Webhook

Highlights:
- Similar to Option 4, though the infrastructure is provided by Metal
- Fabric instead emits a similarly shaped event that says a connection's state has changed
- It's Metal's responsibility to consume that and respond accordingly

Changes Required:
- Fabric sends updates to this webhook URL
- Metal consumes messages on that URL and handles them accordingly
- Metal provides a way to see current and desired state

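The "current and desired state" requirement could be sketched as
below. The event shape (=connection_id=, =state=) and all names are
hypothetical, and the HTTP layer that would receive the webhook call
is left out.

```python
from dataclasses import dataclass, field


@dataclass
class ConnectionStates:
    desired: dict = field(default_factory=dict)  # what Metal asked for
    current: dict = field(default_factory=dict)  # what Fabric last reported

    def handle_webhook(self, event: dict) -> None:
        """Record a state change delivered to the webhook URL."""
        self.current[event["connection_id"]] = event["state"]

    def pending(self) -> dict:
        """Map each out-of-sync connection to (current, desired)."""
        return {
            cid: (self.current.get(cid), want)
            for cid, want in self.desired.items()
            if self.current.get(cid) != want
        }


states = ConnectionStates(desired={"c1": "active"})
print(states.pending())
states.handle_webhook({"connection_id": "c1", "state": "active"})
print(states.pending())
```
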
Advantages

Disadvantages

* Documents

** Equinix Interconnections

Metal provided interconnections early on to give customers access to the
network capabilities provided by Fabric and Network Edge.

There are currently two basic types of interconnections: a dedicated
interconnection and a shared one. The dedicated version, as it sounds,
uses dedicated port infrastructure that the customer owns. This is
often cost-prohibitive, so interconnections over Equinix-owned shared
infrastructure fill that space.

The dedicated interconnection types have relatively simple logic in
the API compared to shared interconnections. A dedicated
interconnection gives you a layer 2 connection and that's all; the
rest is on the customer to manage.

Shared connections connect Metal to other networks through either
layer 2 or layer 3.

Layer 2 interconnections are created using either the
=VlanFabricVCCreateInput= or the =SharedPortVCVlanCreateInput=. The
former provides the interconnection using service tokens, which Metal
uses to poll the status of the interconnections. These allowed us to
provide customers with connectivity, but with a poor experience,
because if you look at the connection in Fabric, it's not clear how it
relates to Metal resources.

The =SharedPortVCVlanCreateInput= allows Fabric access to the related
network resources on the Metal side, which means managing these network
resources on Fabric is a little bit easier. This type of
interconnection did some groundwork to bring our physical and logical
networks between Metal and Fabric closer together. That's mostly
invisible to the customer, but it enables us to build products on our
network infrastructure that weren't previously possible.

Currently, both methods of creating these interconnections exist,
until we can deprecate the =VlanFabricVCCreateInput=. The
=SharedPortVCVlanCreateInput= type is only capable of layer 2
interconnections to Amazon Web Services. This new input type allows
Fabric to start supporting more layer 2 connectivity without requiring
any work on the Metal side. Once we reach parity with the connection
destinations of =VlanFabricVCCreateInput=, we can deprecate that older
input type.

Layer 3 interconnections are created by passing the
=VrfFabricVCCreateInput= to the interconnections endpoint. These
isolate customer traffic by routing table instead of through VLAN
tags.

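As a quick summary of the three input types described in this
section, the lookup below records the layer and isolation mechanism
for each. Only the type names come from the text; the table layout and
the helper are illustrative.

```python
# Layer and isolation per shared-interconnection input type, as described above.
INPUT_TYPES = {
    "VlanFabricVCCreateInput": {"layer": 2, "isolation": "VLAN tags"},
    "SharedPortVCVlanCreateInput": {"layer": 2, "isolation": "VLAN tags"},
    "VrfFabricVCCreateInput": {"layer": 3, "isolation": "routing table"},
}


def describe(input_type: str) -> str:
    """Return a one-line description of a known input type."""
    info = INPUT_TYPES[input_type]  # raises KeyError for unknown types
    return f"{input_type}: layer {info['layer']}, isolated by {info['isolation']}"


for name in INPUT_TYPES:
    print(describe(name))
```
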