diff --git a/design/interconnection-models.org b/design/interconnection-models.org
new file mode 100644
index 0000000..b4a43db
--- /dev/null
+++ b/design/interconnection-models.org
@@ -0,0 +1,34 @@
+#+TITLE: How do Interconnections work for dummiez
+#+Author: Adam Mohammed
+
+
+* User Flows
+
+A user starts by making an API call to ~POST
+/projects/:id/connections~. When they make this request, they can use
+either a full dedicated port, on which they get the full bandwidth,
+or a shared port. The dedicated port guarantees the full bandwidth,
+but is more costly.
+
+A user is also able to select whether the connection at Metal is
+the A-side or the Z-side. If it's the A-side, then Metal does the
+billing; if it's the Z-side, Fabric takes care of the billing.
+
+A-side/Z-side is telecom terminology, where the A-side is the
+requester and the Z-side is the destination. So in the case of
+connecting to a CSP, we're concerned with the A-side from Metal because
+that means we're making use of Fabric as a service provider to give us
+a connection to the CSP within the metro.
+
+If we were making Z-side connections, we'd be granting someone else
+in the DC access to our networks.
+
+
+* Under the Hood
+
+When the request comes in, we create:
+
+- An interconnection object to represent the request
+- Virtual Ports
+- Virtual circuits associated with each port
+- A service token for each
diff --git a/design/metal-fabric-message-bus.org b/design/metal-fabric-message-bus.org
new file mode 100644
index 0000000..cc3812f
--- /dev/null
+++ b/design/metal-fabric-message-bus.org
@@ -0,0 +1,43 @@
+#+TITLE: Metal Event Entrypoint
+#+AUTHOR: Adam Mohammed
+
+
+* Problem
+
+We would like other parts of the company to be able to notify Metal about
+changes to infrastructure that cross out of Metal's business
+domain. The concrete example here is for Fabric to tell Metal about
+the state of interconnections.
+
+* Solution
+
+Metal's API team would like to expose a message bus to receive events
+from the rest of the organization.
+
+Metal's API currently sits on top of a RabbitMQ cluster, and we'd like
+to leverage that infrastructure. There are a couple of problems we
+need to solve before we can expose the RabbitMQ cluster.
+
+1. RabbitMQ is currently only available within the cluster.
+2. Fabric (and other interested parties) exist outside of the Metal
+   firewalls that allow traffic into the K8s clusters.
+3. We need to limit the blast radius if something were to happen on this
+   shared infrastructure; we don't want the main operations on Rabbit that
+   Metal relies on to be impacted.
+
+
+For 1, the answer is simple: expose a path under
+=api.core-a.ny5.metalkube.net= that points to the rabbit service.
+
+For 2, we leverage the fact that CF and Akamai are whitelisted to the
+Metal K8s clusters for the domains =api.packet.net= and
+=api.equinix.com/metal/v1=. This covers getting the cluster exposed to
+the internet.
+
+For 3, we can make use of RabbitMQ [[https://www.rabbitmq.com/vhosts.html][Virtual Hosts]] to isolate the
+/foreign/ traffic to that host. This lets us set up separate
+authentication and authorization policies (such as using Identity-API
+via the [[https://www.rabbitmq.com/oauth2.html][OAuth]] plugin), which are absolutely
+necessary now that the core infrastructure is on the internet. We are
+also able to limit resource usage per vhost to prevent attackers from
+affecting the core API workload.
diff --git a/design/multi-cloud-networking.org b/design/multi-cloud-networking.org
new file mode 100644
index 0000000..49a052c
--- /dev/null
+++ b/design/multi-cloud-networking.org
@@ -0,0 +1,26 @@
+Ok, so I met with Sangeetha and Bob from MCNS and I think I have an
+idea of what needs to happen for our integrated network for us to
+build things like MCNS and VMaaS.
+
+First, you just need two things to be able to integrate at the
+boundaries of Metal and Fabric: you need a VNI and you need a USE
+port. Metal already has a service which allocates VNIs, so I was
+wondering why Jarrod might not have told MCNS about it. Since VNIs and
+USE ports are both shared resources that we want a single bookkeeper
+over, there's only one logical point to do that today, and that's the
+Metal API.
+
+In a perfect world though, the Metal API doesn't orchestrate our
+internal network state so specifically, at least I think. It'd be nice
+if we could rip out the USE port management from the API and push that
+down a layer away from the customer-facing API. The end result is we
+have internal services (Metal API, MCNS, VMaaS) all building on our
+integrated network, but we still just have a single source of truth
+for allocating the shared resources.
+
+Sangeetha got a slice of VNIs and (eventually will have) USE ports for
+them to build the initial MCNS product, but eventually we'll want to
+bring those VNIs and ports under control of a single service, so we
+don't have multiple bookkeeping spots for the same resources.
+Jarrod's initial plan was to just build that into the Metal API, but
+if we can,
diff --git a/design/nimf-m2.org b/design/nimf-m2.org
new file mode 100644
index 0000000..500f39f
--- /dev/null
+++ b/design/nimf-m2.org
@@ -0,0 +1,122 @@
+#+TITLE: NIMF Milestone 2
+#+SUBTITLE: Authentication and Authorization
+#+AUTHOR: Adam Mohammed
+
+* Overview
+
+This document discusses the authentication and authorization between Metal
+and Fabric, focussed on the customer's experience. We want to deliver a
+seamless user experience that allows users to set up connections
+directly from Metal to any of the Cloud Service Providers (CSPs) they
+leverage.
+
+* Authentication
+
+** Metal
+
+There are a number of ways to authenticate to Metal, but ultimately it
+comes down to the mode that the customer wishes to use to access their
+resources.
+The main methods are the web portal and direct API access.
+
+Portal access uses an OAuth flow that lets the browser obtain a JWT,
+which can be used to authenticate against the Metal APIs. It's
+important to understand that the Portal doesn't make calls as itself
+on behalf of the user; instead, the users themselves are making the
+calls by way of their browser.
+
+Direct API access is done through static API keys issued either to a
+user or to a project. Integrations through tooling and
+language-specific libraries are also provided.
+** Fabric
+
+* Authorization
+
+** Metal
+
+** Fabric
+
+
+
+Option 4 - Asynchronous Events
+
+Highlights:
+- Fabric no longer makes direct calls to Metal; it only announces that the connection is ready
+- Messages are authenticated with JWT
+- Metal consumes the events and modifies the state of resources as a controller
+
+
+Option 5 - Callback/Webhook
+
+
+
+Highlights
+
+- Similar to Option 4, though the infrastructure is provided by Metal
+
+- Fabric instead emits a similarly shaped event that says a connection's state has changed
+
+- It’s Metal’s responsibility to consume that and respond accordingly
+
+Changes Required
+
+- Fabric sends updates to this webhook URL
+
+- Metal consumes messages on that URL and handles them accordingly
+
+- Metal provides a way to see current and desired state
+
+Advantages
+
+Disadvantages
+
+* Documents
+
+** Equinix Interconnections
+
+Metal provided interconnections early on to give customers access to the
+network capabilities provided by Fabric and Network Edge.
+
+There are currently two basic types of interconnections: a dedicated
+interconnection and a shared one. The dedicated version, as it sounds,
+uses dedicated port infrastructure that the customer owns. This is
+often cost-prohibitive, so interconnections over Equinix-owned shared
+infrastructure fill that space.
+
+The dedicated interconnection types have relatively simple logic in
+the API relative to shared interconnections. A dedicated
+interconnection gives you a layer 2 connection and that's all; the
+rest is on the customer to manage.
+
+Shared connections connect Metal to other networks either through
+layer 2 or layer 3.
+
+Layer 2 interconnections are created using either the
+=VlanFabricVCCreateInput= or the =SharedPortVCVlanCreateInput=. The
+former provides the interconnection using service tokens, used by
+Metal to poll the status of the interconnections. These allowed us to
+provide customers with connectivity, but a poor experience, because if
+you look at the connection in Fabric, it's not clear how it relates to
+Metal resources.
+
+The =SharedPortVCVlanCreateInput= allows Fabric access to the related
+network resources on the Metal side, which means managing these network
+resources on Fabric is a little bit easier. This type of
+interconnection did some groundwork to bring our physical and logical
+networks between Metal and Fabric closer together. That is mostly
+invisible to the customer, but it enables us to build products on our
+network infrastructure that weren't previously possible.
+
+Currently, both methods of creating these interconnections coexist
+until we can deprecate the =VlanFabricVCCreateInput=. The
+=SharedPortVCVlanCreateInput= type is only capable of layer 2
+interconnections to Amazon Web Services. This new input type allows
+Fabric to start supporting more layer 2 connectivity without requiring
+any work on the Metal side. Once we reach parity with the connection
+destinations of =VlanFabricVCCreateInput= we can deprecate that older
+input type.
+
+Layer 3 interconnections are created by passing the
+=VrfFabricVCCreateInput= to the interconnections endpoint. These
+isolate customer traffic by routing table instead of through VLAN
+tags.
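
Note on the "Equinix Interconnections" section: the three input types it describes can be summarised as a small lookup table. The sketch below is illustrative only. The type names come from the document, but the =InputTypeSummary= class, its fields, and the =input_types_for_layer= helper are hypothetical and do not reflect the actual Metal API schema.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class InputTypeSummary:
    """Hypothetical summary record; not the real API schema."""
    layer: int       # OSI layer of the resulting interconnection
    isolation: str   # how customer traffic is kept separate
    note: str        # caveat taken from the design notes


# Keys are the input type names used in the design notes.
INPUT_TYPES: Dict[str, InputTypeSummary] = {
    "VlanFabricVCCreateInput": InputTypeSummary(
        2, "VLAN tags", "service tokens; to be deprecated"),
    "SharedPortVCVlanCreateInput": InputTypeSummary(
        2, "VLAN tags", "only AWS destinations so far"),
    "VrfFabricVCCreateInput": InputTypeSummary(
        3, "routing table", "VRF-based isolation"),
}


def input_types_for_layer(layer: int) -> List[str]:
    """Return the input type names that create interconnections at a layer."""
    return sorted(name for name, s in INPUT_TYPES.items() if s.layer == layer)


if __name__ == "__main__":
    for layer in (2, 3):
        print(layer, input_types_for_layer(layer))
```

Keeping the caveats next to each type name makes it easier to see which input type would be retired once the shared-port type reaches destination parity.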