pre-shared-cert ordering (documentation/implementation mismatch) #1309

Open
hermanbanken opened this issue Oct 29, 2020 · 13 comments
Labels
lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness.

Comments

@hermanbanken

According to the documentation [1], the order of the certificates in the pre-shared-cert annotation is significant:

Which certificate is presented?

The load balancer chooses a certificate according to these rules:
[...]
For pre-shared certificates listed in the annotation, the primary certificate is the first certificate in the list.

However, the implementation does not behave this way. When I apply the YAML in [2], the controller rewrites the annotation: the last-applied annotation correctly reflects the order cert-2, cert-1, but the pre-shared-cert annotation becomes cert-1,cert-2.

This is troublesome because some clients might not support SNI, so the order of the certificates in GCP load balancers is critically important. We switched from Kubernetes-managed certificates to pre-shared certs so that multiple clusters, each with its own LB, can more easily share the same certificates. Our security officer could then deliver us a new certificate name, which we would apply to all clusters automatically using Kustomize and Flux (GitOps). The reordering bug gets in the way of that. As a workaround, we'll need to give the primary certificate a lexicographically first name.
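To make the effect concrete, here is a minimal sketch of the reordering described above. It assumes the rewrite is a plain lexicographic sort joined with commas, which matches the observed cert-1,cert-2 result but is not taken from the controller's code; the function names are made up for illustration.

```python
from typing import List


def parse_certs(value: str) -> List[str]:
    """Split a pre-shared-cert annotation value into certificate names."""
    return [name.strip() for name in value.split(",") if name.strip()]


def annotation_after_sort(value: str) -> str:
    """Assumed rewrite: sort names lexicographically and join with commas."""
    return ",".join(sorted(parse_certs(value)))


def primary(value: str) -> str:
    """Per the docs quoted above, the primary cert is the first in the list."""
    return parse_certs(value)[0]


submitted = "cert-2, cert-1"
rewritten = annotation_after_sort(submitted)
print(primary(submitted), "->", primary(rewritten))
```

With the names from this issue, the intended primary is cert-2, but after the sort the first entry is cert-1. A name only stays primary if it already sorts first, which is why the workaround relies on naming.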

[1] = https://cloud.google.com/kubernetes-engine/docs/how-to/ingress-multi-ssl#which_certificate_is_presented
[2] = Ingress resource

# file: ingress.yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    ingress.gcp.kubernetes.io/pre-shared-cert: cert-2, cert-1
    kubernetes.io/ingress.allow-http: "false"
  name: services
spec:
  rules:
  - host: subdomain2.example.org
    http:
      paths:
      - backend:
          serviceName: service2
          servicePort: 80
  - host: subdomain1.example.org
    http:
      paths:
      - backend:
          serviceName: service1
          servicePort: 80
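
One way to spot the rewrite on a live object is to compare the pre-shared-cert annotation with the copy stored in the kubectl.kubernetes.io/last-applied-configuration annotation. The sketch below assumes the Ingress was created with kubectl apply (so that annotation exists) and that the live object has already been loaded as a dict, for example from `kubectl get ingress services -o json`. The helper names are hypothetical.

```python
import json

LAST_APPLIED = "kubectl.kubernetes.io/last-applied-configuration"
PRE_SHARED_CERT = "ingress.gcp.kubernetes.io/pre-shared-cert"


def cert_list(value):
    """Split an annotation value into a list of certificate names."""
    return [name.strip() for name in value.split(",") if name.strip()]


def annotations_of(obj):
    """Return metadata.annotations of a Kubernetes object dict, or {}."""
    return obj.get("metadata", {}).get("annotations", {}) or {}


def order_was_rewritten(ingress):
    """True if the live list has the same certs as applied, in another order."""
    live_annotations = annotations_of(ingress)
    raw = live_annotations.get(LAST_APPLIED)
    if raw is None:
        return False  # nothing to compare against
    applied = cert_list(annotations_of(json.loads(raw)).get(PRE_SHARED_CERT, ""))
    live = cert_list(live_annotations.get(PRE_SHARED_CERT, ""))
    return sorted(live) == sorted(applied) and live != applied
```

This only detects a pure reorder. If another controller also adds or removes names, the two lists differ in content and the function returns False.
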
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 27, 2021
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 26, 2021
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-contributor-experience at kubernetes/community.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.


@rsevat

rsevat commented Jun 15, 2023

/reopen
/remove-lifecycle rotten

I can confirm that this is still the case: the list gets reordered alphabetically, no matter what order you submit it in. As a result, my controller was continuously fighting with the controller that kept reordering it. The documentation stating that the order matters is correct: the first certificate in the list does get presented as the primary certificate if there is no SNI match.

Because of this bug, the only way we can currently control which certificate is presented is to use self-managed certificates, whose names we can choose ourselves, so that they sort alphabetically either before or after the Google-managed certificates.

For example:

We use a single Google-managed certificate with a name randomly generated by Google: mcrt-780449c3-edc2-4989

With multiple subject alternative names:

app.abc.org
app2.abc.org

Now we want to add app3.abc.org, but that domain is still live in our old datacenter. To migrate it to GCP, we add a wildcard certificate named wildcard-abc-org, with the domain name *.abc.org, to our ingress. That gives us a zero-downtime migration while the new Google-managed certificate gets created in the background.

We add it to our ingress as:
ingress.gcp.kubernetes.io/pre-shared-cert: wildcard-abc-org, mcrt-780449c3-edc2-4989

Expected behavior:

Users that connect to app3.abc.org have no SNI match and are therefore shown the primary certificate (first in the list), which is the wildcard certificate.

Actual behavior:

Under the hood, the pre-shared-cert annotation gets rewritten in alphabetical order to:

ingress.gcp.kubernetes.io/pre-shared-cert: mcrt-780449c3-edc2-4989, wildcard-abc-org

Users that connect to app3.abc.org have no SNI match and are therefore shown the primary certificate (first in the list). That certificate is invalid for them because it only covers app.abc.org and app2.abc.org, which leads to certificate errors.
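The difference between the two orderings can be sketched as follows. This does not model how the load balancer picks a certificate by SNI. It takes the premise above (the request falls back to the primary certificate) and only checks whether that primary covers the hostname. The `covers` and `fallback_is_valid` helpers are hypothetical, and wildcard matching is simplified to a single leftmost label.

```python
from typing import Dict, List


def covers(dns_name: str, hostname: str) -> bool:
    """Check a certificate DNS name against a hostname.

    A leading '*.' is treated as matching exactly one leftmost label.
    """
    dns_name, hostname = dns_name.lower(), hostname.lower()
    if dns_name.startswith("*."):
        suffix = dns_name[1:]  # e.g. ".abc.org"
        if not hostname.endswith(suffix):
            return False
        label = hostname[: len(hostname) - len(suffix)]
        return label != "" and "." not in label
    return dns_name == hostname


def fallback_is_valid(cert_order: List[str], sans: Dict[str, List[str]], hostname: str) -> bool:
    """True if the primary (first) certificate covers the hostname."""
    primary = cert_order[0]
    return any(covers(name, hostname) for name in sans[primary])


sans = {
    "wildcard-abc-org": ["*.abc.org"],
    "mcrt-780449c3-edc2-4989": ["app.abc.org", "app2.abc.org"],
}
submitted = ["wildcard-abc-org", "mcrt-780449c3-edc2-4989"]
print(fallback_is_valid(submitted, sans, "app3.abc.org"))          # order as submitted
print(fallback_is_valid(sorted(submitted), sans, "app3.abc.org"))  # order after sorting
```

In the submitted order the fallback is the wildcard certificate and app3.abc.org is covered. In the sorted order the fallback is the mcrt- certificate, which does not list app3.abc.org.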

The only ways for us to mitigate this are:

  1. Ensure self-managed certificates have an alphabetically earlier name, like: a-wildcard-abc-org
  2. Delete the Google-managed certificate and present only the wildcard certificate during the migration, while we wait for the new Google-managed certificate that includes all domain names to be ready

We'll be going with option 2 for now. Ideally this gets fixed so that the list is not reordered and the user can actually control which certificate is presented first.

@k8s-ci-robot
Contributor

@rsevat: You can't reopen an issue/PR unless you authored it or you are a collaborator.


@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Jun 15, 2023
@hermanbanken
Author

/reopen

@k8s-ci-robot k8s-ci-robot reopened this Jun 26, 2023
@k8s-ci-robot
Contributor

@hermanbanken: Reopened this issue.


@hermanbanken
Author

Reopening for @rsevat.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 23, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 22, 2024
@swetharepakula
Member

/lifecycle frozen

To keep the issue alive.

@k8s-ci-robot k8s-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. labels Feb 22, 2024
@swetharepakula
Member

swetharepakula commented Feb 22, 2024

The Ingress controller does not write to this annotation. The reordering is actually a consequence of using the Managed Certificates controller, which writes to the annotation to add the Google-managed cert: https://github.com/GoogleCloudPlatform/gke-managed-certs/blob/master/pkg/controller/sync/sync.go#L152-L162.

The managed cert controller has to add the managed cert to the pre-shared-cert annotation for the Ingress controller to add it as an SSL cert. When adding the cert, the managed cert controller always sorts the certs.
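
As a rough illustration of that behavior (a Python approximation, not the controller's actual Go code), merging a managed cert name into the annotation and sorting looks like the first function below. The second function is a hypothetical order-preserving variant, shown only to contrast the two results. The comma separator and both function names are assumptions.

```python
from typing import List


def merge_sorted(existing: str, managed: List[str]) -> str:
    """Approximation of the described behavior: add names, then sort."""
    names = {n.strip() for n in existing.split(",") if n.strip()}
    names.update(managed)
    return ",".join(sorted(names))


def merge_preserving_order(existing: str, managed: List[str]) -> str:
    """Hypothetical alternative: keep user order, append new names at the end."""
    names: List[str] = []
    for n in existing.split(","):
        n = n.strip()
        if n and n not in names:
            names.append(n)
    for m in managed:
        if m not in names:
            names.append(m)
    return ",".join(names)


existing = "wildcard-abc-org"
managed = ["mcrt-780449c3-edc2-4989"]
print(merge_sorted(existing, managed))
print(merge_preserving_order(existing, managed))
```

With the names from the earlier comment, the sorted merge puts the mcrt- name first, while the order-preserving variant leaves wildcard-abc-org as the first (primary) entry.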

6 participants