Geo proxying

With Geo proxying, secondaries proxy web requests through Workhorse to the primary. Users who navigate to the secondary see a read-write UI and can perform every operation that is available on the primary.

Request life cycle

Top-level view

The proxying interaction can be explained at a high level through the following diagram:

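Sequence (participants: client, secondary, primary):

1. client -> secondary: GET /explore
2. secondary -> primary: GET /explore (proxied)
3. primary -> secondary: HTTP/1.1 200 OK [..]
4. secondary -> client: HTTP/1.1 200 OK [..]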

Proxy detection mechanism

When Geo is enabled, Workhorse polls the internal API to learn whether it should proxy requests to the primary and, if so, the URL of the primary (as stored in the database). When proxying should be enabled, the internal API responds with the primary URL and JWT-signed data that is passed on to the primary with every request.

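Sequence (participants: Workhorse (secondary), Internal Rails API), repeated in a loop that polls every 10 seconds:

1. Workhorse (secondary) -> Internal Rails API: GET /api/v4/geo/proxy (internal)
2. Internal Rails API -> Workhorse (secondary): {geo_proxy_primary_url, geo_proxy_extra_data}, update config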
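The polling loop can be sketched as follows. This is a minimal illustration, not Workhorse's actual implementation (Workhorse is written in Go). `ProxyConfig`, `apply_poll_response`, and `poll` are hypothetical names, and treating a missing `geo_proxy_primary_url` as "proxying off" is an assumption. Only the endpoint, the two response keys, and the 10-second interval come from the description above.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional

POLL_INTERVAL_SECONDS = 10  # interval named in the polling loop above


@dataclass
class ProxyConfig:
    """Hypothetical holder for what the secondary learns from polling."""
    primary_url: Optional[str] = None
    extra_data: Optional[str] = None

    @property
    def proxy_enabled(self) -> bool:
        return self.primary_url is not None


def apply_poll_response(config: ProxyConfig, payload: dict) -> None:
    """Update config from one decoded /api/v4/geo/proxy response.

    Assumption: a missing or empty geo_proxy_primary_url turns proxying off.
    """
    config.primary_url = payload.get("geo_proxy_primary_url") or None
    config.extra_data = (
        payload.get("geo_proxy_extra_data") if config.primary_url else None
    )


def poll(fetch: Callable[[], dict], config: ProxyConfig,
         rounds: Optional[int] = None,
         sleep: Callable[[float], None] = time.sleep) -> None:
    """Call `fetch` every POLL_INTERVAL_SECONDS; rounds=None loops forever."""
    done = 0
    while rounds is None or done < rounds:
        apply_poll_response(config, fetch())
        done += 1
        sleep(POLL_INTERVAL_SECONDS)
```

Here `fetch` stands in for the HTTP call to the internal API. Injecting it, along with `sleep`, keeps the sketch runnable without a GitLab instance.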

In-depth request flow and local data acceleration compared with proxying

In more detail, Workhorse on the secondary (the requested site) decides whether to proxy each request. If it can “accelerate” the data type (that is, serve it locally to save a round trip), it returns the data immediately. Otherwise, traffic is sent to the primary’s internal URL and served by Workhorse on the primary exactly as a direct request would be. The response is then proxied back to the user through the secondary Workhorse in the same connection.

Flowchart:

Client -> Workhorse (secondary) -> Serve data locally?
  Yes: Workhorse (secondary) serves the data
  No (proxy): Workhorse (primary)
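The accelerate-or-proxy decision can be sketched as a small dispatcher. This is only an illustration. `handle_request` and its three callables are hypothetical, and which data types can actually be accelerated is left to the `can_accelerate` predicate the caller supplies.

```python
from typing import Callable


def handle_request(
    path: str,
    can_accelerate: Callable[[str], bool],
    serve_locally: Callable[[str], str],
    proxy_to_primary: Callable[[str], str],
) -> str:
    """Serve from the secondary when possible, otherwise proxy to the primary."""
    if can_accelerate(path):
        # Accelerated: answer immediately and save the round trip.
        return serve_locally(path)
    # Not accelerated: forward to the primary and relay its response.
    return proxy_to_primary(path)
```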

Sign-in

Requests proxied to the primary requiring authorization

Sequence (participants: Client, Secondary, Primary). The diagram marks the sign-in steps as optional, under the condition “primary not signed in”, and carries the note “authentication happens, POST to same URL etc”.

1. `/group/project` request
2. proxy /group/project
3. 302 redirect
4. proxy 302 redirect
5. /users/sign_in
6. proxy /users/sign_in
7. 302 redirect
8. proxy 302 redirect
9. /group/project
10. proxy /group/project
11. /group/project logged in response (session on primary created)
12. proxy full response

Requests requiring a user session on the secondary

At the moment, this flow only applies to Project Replication Details and Design Replication Details in the Geo Admin Area. For more context, see View replication data on the primary site.

Sequence (participants: Client, Secondary, Primary). The diagram marks two optional blocks, “primary not signed in” and “secondary not signed in”, and carries the note “authentication happens, POST to same URL etc”.

1. `admin/geo/replication/projects` request
2. 302 redirect
3. /users/auth/geo/sign_in
4. 302 redirect
5. /oauth/geo/auth/geo/sign_in
6. 302 redirect
7. /oauth/authorize
8. proxy /oauth/authorize
9. 302 redirect
10. proxy 302 redirect
11. /users/sign_in
12. proxy /users/sign_in
13. 302 redirect
14. proxy 302 redirect
15. /oauth/geo/callback
16. 302 redirect
17. admin/geo/replication/projects
18. admin/geo/replication/projects logged in response (session on both primary and secondary)

Git pull

For historical reasons, the push_from_secondary path is used to forward a Git pull. There is an issue proposing to rename this route to avoid confusion.

Git pull over HTTP(S)

Accelerated repositories

When a repository exists on the secondary and we detect that it is up to date with the primary, we serve it directly instead of proxying.

Sequence (participants: Git client, Workhorse (secondary), Rails (secondary), Gitaly (secondary)). Diagram note: decide that the repo is synced and up to date.

1. GET /foo/bar.git/info/refs/?service=git-upload-pack
2. <internal API check>
3. 401 Unauthorized
4. <response>
5. GET /foo/bar.git/info/refs/?service=git-upload-pack
6. <internal API check>
7. Render Workhorse OK
8. 200 OK
9. POST /foo/bar.git/git-upload-pack
10. GitHttpController
11. Render Workhorse OK
12. Workhorse gets the connection details from Rails, connects to Gitaly: SmartHTTP Service, UploadPack RPC (check the proto for details)
13. Return a stream of Proto messages
14. Pipe messages to the Git client

Proxied repositories

If a requested repository isn’t synced, or we detect that it is not up to date, the request is proxied to the primary to get the latest version of the changes.

Sequence (participants: Git client, Workhorse (secondary), Rails (secondary), Workhorse (primary), Rails (primary), Gitaly (primary)). Diagram notes: decide that the repo is out of date; proxied.

1. GET /foo/bar.git/info/refs/?service=git-upload-pack
2. <response>
3. 302 Redirect to /-/push_from_secondary/2/foo/bar.git/info/refs?service=git-upload-pack
4. <response>
5. GET /-/push_from_secondary/2/foo/bar.git/info/refs/?service=git-upload-pack
6. <proxied request>
7. <data>
8. 401 Unauthorized
9. <proxied response>
10. <response>
11. GET /-/push_from_secondary/2/foo/bar.git/info/refs/?service=git-upload-pack
12. <proxied request>
13. <data>
14. Render Workhorse OK
15. <proxied response>
16. <response>
17. POST /-/push_from_secondary/2/foo/bar.git/git-upload-pack
18. <proxied request>
19. GitHttpController
20. Render Workhorse OK
21. Workhorse gets the connection details from Rails, connects to Gitaly: SmartHTTP Service, UploadPack RPC (check the proto for details)
22. Return a stream of Proto messages
23. Pipe messages to the Git client
24. Return piped messages from Git
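The choice between serving a pull locally and redirecting it can be sketched as below. This is a hedged illustration, not the Rails implementation. `route_git_pull` and its `synced` and `up_to_date` flags are hypothetical stand-ins for whatever checks the secondary really performs. Only the `/-/push_from_secondary/<secondary id>/` prefix comes from the flow above.

```python
from typing import Optional


def route_git_pull(request_path: str, secondary_id: int,
                   synced: bool, up_to_date: bool) -> Optional[str]:
    """Return None to serve the pull locally, or the path to redirect to.

    Requests to the returned path are then proxied to the primary.
    """
    if synced and up_to_date:
        return None  # accelerated: the secondary's own Gitaly serves the data
    return f"/-/push_from_secondary/{secondary_id}{request_path}"
```

For example, with secondary ID 2 and an out-of-date repository, `/foo/bar.git/info/refs?service=git-upload-pack` maps to `/-/push_from_secondary/2/foo/bar.git/info/refs?service=git-upload-pack`.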

Git pull over SSH

Because SSH operations go through GitLab Shell instead of Workhorse, they are not proxied through the mechanism used for Workhorse requests. Instead, the secondary Rails internal API proxies them to the primary site as Git HTTP requests.

Accelerated repositories

When a repository exists on the secondary and we detect that it is up to date with the primary, we serve it directly instead of proxying.

Sequence (participants: Git client, GitLab Shell (secondary), Internal API (secondary Rails), Gitaly (secondary)):

1. git pull
2. SSH key validation (api/v4/internal/authorized_keys?key=..)
3. HTTP/1.1 200 OK
4. InfoRefs:UploadPack RPC
5. stream Git response back
6. stream Git response back
7. stream Git data to push
8. UploadPack RPC
9. stream Git response back
10. stream Git response back

Proxied repositories

If a requested repository isn’t synced, or we detect that it is not up to date, the request is proxied to the primary to get the latest version of the changes.

Sequence (participants: Git client, GitLab Shell (secondary), Internal API (secondary Rails), Primary API):

1. git pull
2. SSH key validation (api/v4/internal/authorized_keys?key=..)
3. HTTP/1.1 300 (custom action status) with {endpoint, msg, primary_repo}
4. POST /api/v4/geo/proxy_git_ssh/info_refs_upload_pack
5. POST $PRIMARY/foo/bar.git/info/refs/?service=git-upload-pack
6. HTTP/1.1 200 OK
7. <response>
8. return Git response from primary
9. stream Git data to push
10. POST /api/v4/geo/proxy_git_ssh/upload_pack
11. POST $PRIMARY/foo/bar.git/git-upload-pack
12. HTTP/1.1 200 OK
13. <response>
14. return Git response from primary

Git push

Git push over SSH

Because SSH operations go through GitLab Shell instead of Workhorse, they are not proxied through the mechanism used for Workhorse requests. Instead, the secondary Rails internal API proxies them to the primary site as Git HTTP requests.

Sequence (participants: Git client, GitLab Shell (secondary), Internal API (secondary Rails), Primary API):

1. git push
2. SSH key validation (api/v4/internal/authorized_keys?key=..)
3. HTTP/1.1 300 (custom action status) with {endpoint, msg, primary_repo}
4. POST /api/v4/geo/proxy_git_ssh/info_refs_receive_pack
5. POST $PRIMARY/foo/bar.git/info/refs/?service=git-receive-pack
6. HTTP/1.1 200 OK
7. <response>
8. return Git response from primary
9. stream Git data to push
10. POST /api/v4/geo/proxy_git_ssh/receive_pack
11. POST $PRIMARY/foo/bar.git/git-receive-pack
12. HTTP/1.1 200 OK
13. <response>
14. return Git response from primary
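Both SSH flows (pull and push) call a pair of endpoints on the secondary's internal API: one for the ref advertisement and one for the pack exchange. The mapping can be sketched as below. `ssh_proxy_endpoints` is a hypothetical helper written for illustration, and the endpoint paths are taken from the sequences in this section.

```python
from typing import Tuple

PROXY_GIT_SSH_PREFIX = "/api/v4/geo/proxy_git_ssh"

# Git service name -> suffix used in the proxy endpoint paths.
_SUFFIX = {
    "git-upload-pack": "upload_pack",    # git pull / fetch
    "git-receive-pack": "receive_pack",  # git push
}


def ssh_proxy_endpoints(git_service: str) -> Tuple[str, str]:
    """Return (info_refs endpoint, pack endpoint) for a Git service."""
    try:
        suffix = _SUFFIX[git_service]
    except KeyError:
        raise ValueError(f"unsupported Git service: {git_service}") from None
    return (
        f"{PROXY_GIT_SSH_PREFIX}/info_refs_{suffix}",
        f"{PROXY_GIT_SSH_PREFIX}/{suffix}",
    )
```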

Git push over HTTP(S)

Git push over HTTP(S) unified URLs

With unified URLs, a push redirects to a local path formatted as /-/push_from_secondary/$SECONDARY_ID/*. Further requests through this path are proxied to the primary, which handles the push.

Sequence (participants: Git client, Workhorse (secondary), Workhorse (primary), Rails (primary), Gitaly (primary)):

1. GET /foo/bar.git/info/refs/?service=git-receive-pack
2. 302 Redirect to /-/push_from_secondary/2/foo/bar.git/info/refs?service=git-receive-pack
3. GET /-/push_from_secondary/2/foo/bar.git/info/refs/?service=git-receive-pack
4. <proxied request>
5. <data>
6. 401 Unauthorized
7. <proxied response>
8. <response>
9. GET /-/push_from_secondary/2/foo/bar.git/info/refs/?service=git-receive-pack
10. <proxied request>
11. <data>
12. Render Workhorse OK
13. <proxied response>
14. <response>
15. POST /-/push_from_secondary/2/foo/bar.git/git-receive-pack
16. <proxied request>
17. GitHttpController:git_receive_pack
18. Render Workhorse OK
19. Get connection details from Rails and connects to SmartHTTP Service, ReceivePack RPC
20. Return a stream of Proto messages
21. Pipe messages to the Git client
22. Return piped messages from Git

Git push over HTTP(S) with separate URLs

With separate URLs, the secondary redirects to a URL formatted like $PRIMARY/-/push_from_secondary/$SECONDARY_ID/*.

Sequence (participants: Workhorse (secondary), Git client, Workhorse (primary), Rails (primary), Gitaly (primary)):

1. GET $SECONDARY/foo/bar.git/info/refs/?service=git-receive-pack
2. 302 Redirect to $PRIMARY/-/push_from_secondary/2/foo/bar.git/info/refs?service=git-receive-pack
3. GET $PRIMARY/-/push_from_secondary/2/foo/bar.git/info/refs/?service=git-receive-pack
4. <data>
5. 401 Unauthorized
6. <response>
7. GET /-/push_from_secondary/2/foo/bar.git/info/refs/?service=git-receive-pack
8. <data>
9. Render Workhorse OK
10. <response>
11. POST /-/push_from_secondary/2/foo/bar.git/git-receive-pack
12. GitHttpController:git_receive_pack
13. Render Workhorse OK
14. Get connection details from Rails and connects to SmartHTTP Service, ReceivePack RPC
15. Return a stream of Proto messages
16. Pipe messages to the Git client
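The two redirect formats for HTTP(S) pushes, unified and separate URLs, differ only in whether the target is a local path or an absolute URL on the primary. A minimal sketch, assuming a hypothetical helper `push_redirect_location` where passing `primary_url=None` stands for the unified-URL case:

```python
from typing import Optional


def push_redirect_location(request_path: str, secondary_id: int,
                           primary_url: Optional[str] = None) -> str:
    """Build the 302 Location for a Git push received by a secondary.

    primary_url=None models unified URLs: the redirect stays on the same
    host, and later requests on that path are proxied to the primary.
    With separate URLs, the client is sent to the primary directly.
    """
    local_path = f"/-/push_from_secondary/{secondary_id}{request_path}"
    if primary_url is None:
        return local_path
    return primary_url.rstrip("/") + local_path
```

In the separate-URL case the Git client talks to the primary for all subsequent requests, which is why the secondary Workhorse does not appear again in that sequence.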