Note: this page shows the Feature-Based Change Log for a release
These features were completed when this image was assembled
Currently, the "Get started with on-premise host inventory" quickstart is delivered in the Core console. If we are going to keep it here, we need to add the MCE or ACM operator as a prerequisite; otherwise it's very confusing.
https://github.com/openshift/enhancements/pull/555
https://github.com/openshift/api/pull/827
The console operator will need to support single-node clusters.
We have a console deployment and a downloads deployment. Each will need to be updated so that there's only a single replica when high availability mode is disabled in the Infrastructure config. We should also remove the anti-affinity rule in the console deployment that tries to spread console pods across nodes.
The downloads deployment is currently a static manifest. Going forward, it likely needs to be created by the console operator instead.
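As a rough sketch (not the operator's actual manifest; names and image are illustrative), the downloads Deployment could look like this when the Infrastructure config reports single-node (SingleReplica) topology:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: downloads
  namespace: openshift-console
spec:
  replicas: 1  # single replica when high availability mode is disabled
  selector:
    matchLabels:
      app: downloads
  template:
    metadata:
      labels:
        app: downloads
    spec:
      # Note: no podAntiAffinity rule here; with one replica there is
      # nothing to spread across nodes.
      containers:
      - name: downloads
        image: quay.io/openshift/origin-console:latest  # illustrative image
```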
Acceptance Criteria:
Bump github.com/openshift/api to pick up changes from openshift/api#827
OCP/Telco Definition of Done
Feature Template descriptions and documentation.
Early customer feedback is that they see SNO as a great solution covering smaller footprint deployments, but they are wondering what evolution story OpenShift is going to provide when more capacity or high availability is needed in the future.
While migration tooling (moving workload/config to a new cluster) could be a mid-term solution, customers prefer not to involve extra hardware in this process.
For Telecommunications Providers at the Far Edge, the intent is to start small and then grow. Many of these operators will start with a SNO-based DU deployment as an initial investment, but as DUs evolve, different segments of the radio spectrum are added, various radio hardware is provisioned, and features are delivered to the Far Edge, the Telecommunications Providers desire the ability for their Far Edge deployments to scale up from 1 node to 2 nodes to n nodes. On the opposite side of the spectrum from SNO is MMIMO, where there is a robust cluster and workloads use HPA.
Requirement | Notes | isMvp? |
---|---|---|
CI - MUST be running successfully with test automation | This is a requirement for ALL features. | YES |
Release Technical Enablement | Provide necessary release enablement details and documents. | YES |
This Section:
This Section: What does the person writing code, testing, documenting need to know? What context can be provided to frame this feature.
Questions to be addressed:
This is a ticket meant to track all the OCP PRs that are involved in the implementation of the SNO + workers enhancement.
In the console-operator repo we need to add the `capability.openshift.io/console` annotation to all the manifests that the operator either contains or creates on the fly.
Manifests are currently present in /bindata and /manifest directories.
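A minimal sketch of what such an annotated manifest might look like, using the annotation key named above (the value shown is illustrative; the insights-operator change linked below is the authoritative pattern):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: console
  namespace: openshift-console
  annotations:
    capability.openshift.io/console: "true"  # illustrative value
# ... rest of the manifest unchanged
```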
Here is an example of the insights-operator change.
Here is the overall enhancement doc.
We need to continue to maintain specific areas within storage; this epic captures that effort and tracks it across releases.
Goals
Requirements
Requirement | Notes | isMvp? |
---|---|---|
Telemetry | | No |
Certification | | No |
API metrics | | No |
Out of Scope
n/a
Background, and strategic fit
With the expected scale of our customer base, we want to keep the load of customer tickets/BZs low.
Assumptions
Customer Considerations
Documentation Considerations
Notes
In progress:
High prio:
Unsorted
Traditionally we did these updates as bugfixes, because we did them after the feature freeze (FF). Trying no-feature-freeze in 4.12. We will try to do as much as we can before FF, but we're quite sure something will slip past FF as usual.
Update all CSI sidecars to the latest upstream release.
This includes update of VolumeSnapshot CRDs in https://github.com/openshift/cluster-csi-snapshot-controller-operator/tree/master/assets
Update all OCP and kubernetes libraries in storage operators to the appropriate version for OCP release.
This includes (but is not limited to):
Operators:
On new installations, we should make the StorageClass created by the CSI operator the default one.
However, we shouldn't do that on an upgrade scenario. The main reason is that users might have set a different quota on the CSI driver Storage Class.
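For reference, marking a StorageClass as the default uses the standard annotation below; on upgrades the operator would simply leave the annotation unset (names are illustrative):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: managed-csi  # illustrative name
  annotations:
    # set by the CSI operator on new installations only
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: ebs.csi.aws.com  # illustrative provisioner
```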
Exit criteria:
OpenShift console supports new features and elevated experience for Operator Lifecycle Manager (OLM) Operators and Cluster Operators.
The OCP Console improves the controls and visibility for managing vendor-provided software in customers’ infrastructure and makes these solutions available for customers' internal users.
To achieve this,
We want to make sure OLM’s and Cluster Operators' new features are exposed in the console so admin console users can benefit from them.
Requirement | Notes | isMvp? |
---|---|---|
OCP console supports the latest OLM APIs and features | This is a requirement for ALL features. | YES |
OCP console improves visibility to Cluster Operators related resources and features. | This is a requirement for ALL features. | YES |
(Optional) Use Cases
<--- Remove this text when creating a Feature in Jira, only for reference --->
* Main success scenarios - high-level user stories
* Alternate flow/scenarios - high-level user stories
* ...
Questions to answer...
How will the user interact with this feature?
Which users will use this and when will they use it?
Is this feature used as part of the current user interface?
Out of Scope
<--- Remove this text when creating a Feature in Jira, only for reference --->
# List of non-requirements or things not included in this feature
# ...
Background, and strategic fit
<--- Remove this text when creating a Feature in Jira, only for reference --->
What does the person writing code, testing, documenting need to know? What context can be provided to frame this feature.
Assumptions
<--- Remove this text when creating a Feature in Jira, only for reference --->
* Are there assumptions being made regarding prerequisites and dependencies?
* Are there assumptions about hardware, software or people resources?
* ...
Customer Considerations
<--- Remove this text when creating a Feature in Jira, only for reference --->
* Are there specific customer environments that need to be considered (such as working with existing h/w and software)?
...
Documentation Considerations
<--- Remove this text when creating a Feature in Jira, only for reference --->
Questions to be addressed:
* What educational or reference material (docs) is required to support this product feature? For users/admins? Other functions (security officers, etc)?
* Does this feature have doc impact?
* New Content, Updates to existing content, Release Note, or No Doc Impact
* If unsure and no Technical Writer is available, please contact Content Strategy.
* What concepts do customers need to understand to be successful in [action]?
* How do we expect customers will use the feature? For what purpose(s)?
* What reference material might a customer want/need to complete [action]?
* Is there source material that can be used as reference for the Technical Writer in writing the content? If yes, please link if available.
* What is the doc impact (New Content, Updates to existing content, or Release Note)?
The OpenShift console allows users (cluster admins) to change the state of the “default hub sources” for OperatorHub on the cluster from “enabled” to “disabled” and vice versa, through the “Global Configuration → OperatorHub” view under “Cluster Settings”.
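For reference, this maps to the cluster-scoped OperatorHub config resource; a sketch of disabling a single default source (the source name is illustrative):

```yaml
apiVersion: config.openshift.io/v1
kind: OperatorHub
metadata:
  name: cluster
spec:
  # disableAllDefaultSources: true   # alternatively, disable everything
  sources:
  - name: community-operators  # illustrative source name
    disabled: true
```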
Starting from OpenShift 4.4, the console and OLM provide richer configurations for the ‘CatalogSource’ objects that enable users to create their own curated sources for OperatorHub with a custom “Display Name”, “URL of Image Registry”, and “Polling Interval” for updating the custom OperatorHub source.
This epic is about reflecting/exposing the newer capabilities on the ‘OperatorHub’ (Cluster Config view) and the ‘CatalogSource’ list and details views.
1. As an admin user of console, I'd like to:
easily disable/enable the predefined Operator sources for the OperatorHub
so that I can:
control the sources of the Operators my cluster users see on the OperatorHub view.
2. As an admin user of console, I'd like to:
easily understand how to add/edit/view/remove my custom Operator catalogSource for the OperatorHub
so that I can
manage (add/edit/view/remove) my custom sources for the OperatorHub more easily
3. As an admin user of console, I'd like to:
easily see the configurations and status of my custom Operator catalogSource for the OperatorHub
so that I can
manage (review/edit) the configurations of my custom sources for the OperatorHub more easily
Change the state of the default hub sources for OperatorHub on the cluster from enabled to disabled and vice versa. Add and manage your curated sources for OperatorHub on the Sources tab with custom Display Name, URL of Image Registry, and the Polling Interval for updating your custom OperatorHub source.
URL/k8s/cluster/config.openshift.io~v1~OperatorHub/cluster/sources
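A sketch of such a curated CatalogSource with the three custom fields mentioned above (name, image, and interval are illustrative):

```yaml
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: my-catalog              # illustrative
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  displayName: My Custom Catalog            # "Display Name"
  image: registry.example.com/my/index:v1   # "URL of Image Registry"
  updateStrategy:
    registryPoll:
      interval: 30m                         # "Polling Interval"
```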
See current console screenshots in the attachments for reference:
Improve ‘CatalogSource’ details view (on the “Details” tab)
The Sources tab list view now includes 2 new columns for the Catalog Sources:
Action menu changes:
Feature Overview
PF4 has introduced a bevy of new components to really enhance the user experience, specifically around list views. These new components make it easier than ever for customers to find the data they care about and to take action.
Goals
Requirements
Requirement | Notes | isMvp? |
---|---|---|
Improvements must be applied to all resource list view pages including the search page | This is a requirement for ALL features. | YES |
Search page will be the only page that needs "Saved Searches" | This is a requirement for ALL features. | NO |
All components updated must be PF4 supported | This is a requirement for ALL features. | YES |
Questions to answer...
How will the user interact with this feature?
Which users will use this and when will they use it?
Is this feature used as part of the current user interface?
Out of Scope
<--- Remove this text when creating a Feature in Jira, only for reference --->
# List of non-requirements or things not included in this feature
# ...
Background, and strategic fit
<--- Remove this text when creating a Feature in Jira, only for reference --->
What does the person writing code, testing, documenting need to know? What context can be provided to frame this feature.
Assumptions
<--- Remove this text when creating a Feature in Jira, only for reference --->
* Are there assumptions being made regarding prerequisites and dependencies?
* Are there assumptions about hardware, software or people resources?
* ...
Customer Considerations
<--- Remove this text when creating a Feature in Jira, only for reference --->
* Are there specific customer environments that need to be considered (such as working with existing h/w and software)?
...
Documentation Considerations
<--- Remove this text when creating a Feature in Jira, only for reference --->
Questions to be addressed:
* What educational or reference material (docs) is required to support this product feature? For users/admins? Other functions (security officers, etc)?
* Does this feature have doc impact?
* New Content, Updates to existing content, Release Note, or No Doc Impact
* If unsure and no Technical Writer is available, please contact Content Strategy.
* What concepts do customers need to understand to be successful in [action]?
* How do we expect customers will use the feature? For what purpose(s)?
* What reference material might a customer want/need to complete [action]?
* Is there source material that can be used as reference for the Technical Writer in writing the content? If yes, please link if available.
* What is the doc impact (New Content, Updates to existing content, or Release Note)?
The inventory card needs a couple of visual tweaks:
Attaching the desired design from Michael
Feature Overview
This will be phase 1 of Internationalization of the OpenShift Console.
Phase 1 will include the following:
Phase 1 will not include:
Initial List of Languages to Support
---------- 4.7* ----------
*This will be based on the ability to get all the strings externalized; there is a good chance this gets pushed to 4.8.
---------- Post 4.7 ----------
POC
Goals
Internationalization has become table stakes. OpenShift Console needs to support different languages in each of the major markets. This is key functionality that will help unlock sales in different regions.
Requirements
Requirement | Notes | isMvp? |
---|---|---|
Language Selector | | YES |
Localized Date + Time | | YES |
Externalization and translation of all client side strings | | YES |
Translation for Chinese and Japanese | | YES |
Process, infra, and testing capabilities put into place | | YES |
CI - MUST be running successfully with test automation | This is a requirement for ALL features. | YES |
Out of Scope
Assumptions
Customer Considerations
We are rolling this feature out in phases; based on customer feedback, there may be no phase 2.
Documentation Considerations
I believe documentation already supports a large language set.
We have too many namespaces if we're loading them upfront. We should consolidate some of the files.
Externalize strings in the User Management nav section (Users, Groups, Service Accounts, Roles, Role Bindings).
Externalize strings in the pages under the console Home nav section.
Externalize strings for the Home / Search page
Externalize strings for the Home / Namespace pages
Externalize strings in Home / Explore pages
Externalize strings for Home / Events section
Externalize strings for Home / project / dashboard section
Externalize strings in the Monitoring nav section (Alerting, Metrics, Dashboards).
Externalize strings in the pages under the console Workloads nav section.
i18n for Replication Controllers
i18n for Horizontal Pod AutoScalers
Update i18n files/scripts to support Korean in advance of translation work from Terry.
Specifically:
Check in translations from Sprint 192.
Bump i18next dependencies to the latest versions.
By default, we should use the user's browser preference for language, but we should give users a language selector to override. Here is a proposed design:
https://docs.google.com/document/d/17iIBDlEneu0DNhWi2TkQShRSobQZubWOhS0OQr_T3hE/edit#
We need to translate the following common components used in the list and details pages:
Includes list page component, filter toolbar, details page breadcrumbs, resource summary/details item, common details page headings, managed by operator link, and the conditions table.
Externalize strings in the Builds nav section (Build Configs, Builds, Image Streams).
Add zh-cn and ja-jp translations.
Externalize strings in the Compute nav section (Nodes, Machines, Machine Sets, Machine Autoscalers, Machine Health Checks, Machine Configs, Machine Config Pools).
OCP/Telco Definition of Done
Epic Template descriptions and documentation.
<--- Cut-n-Paste the entire contents of this description into your new Epic --->
Rebase openshift-controller-manager to k8s 1.24
Assumption
Doc: https://docs.google.com/document/d/1sXCaRt3PE0iFmq7ei0Yb1svqzY9bygR5IprjgioRkjc/edit
Customers do not pay Red Hat more to run HyperShift control planes and supporting infrastructure than Standalone control planes and supporting infrastructure.
Assumption
Run cluster-storage-operator (CSO) + AWS EBS CSI driver operator + AWS EBS CSI driver control-plane Pods in the management cluster, run the driver DaemonSet in the hosted cluster.
More information here: https://docs.google.com/document/d/1sXCaRt3PE0iFmq7ei0Yb1svqzY9bygR5IprjgioRkjc/edit
As HyperShift Cluster Instance Admin, I want to run AWS EBS CSI driver operator + control plane of the CSI driver in the management cluster, so the guest cluster runs just my applications.
Exit criteria:
OCP/Telco Definition of Done
Epic Template descriptions and documentation.
<--- Cut-n-Paste the entire contents of this description into your new Epic --->
Cluster administrators need an in-product experience to discover and install new Red Hat offerings that can add high value to developer workflows.
Requirements | Notes | IS MVP |
Discover new offerings in Home Dashboard | | Y |
Access details outlining value of offerings | | Y |
Access step-by-step guide to install offering | | N |
Allow developers to easily find and use newly installed offerings | | Y |
Support air-gapped clusters | | Y |
< What are we making, for who, and why/what problem are we solving?>
Discovering solutions that are not available for installation on cluster
No known dependencies
Background, and strategic fit
None
Quick Starts
Developers using Dev Console need to be made aware of the RH developer tooling available to them.
Provide awareness to developers using Dev Console of the RH developer tooling that is available to them, including:
Consider enhancing the +Add page and/or the Guided tour
Provide a Quick Start for installing the Cryostat Operator
To increase usage of our RH portfolio
This issue is to handle the PR comment - https://github.com/openshift/console-operator/pull/770#pullrequestreview-1501727662 for the issue https://issues.redhat.com/browse/ODC-7292
Epic Goal*
Provide a long term solution to SELinux context labeling in OCP.
Why is this important? (mandatory)
As of today, when SELinux is enabled, a PV's files are relabeled when attaching the PV to the pod. This can cause timeouts when the PV contains a lot of files, as well as overload the storage backend.
https://access.redhat.com/solutions/6221251 provides a few workarounds until the proper fix is implemented. Unfortunately these workarounds are not perfect, and we need a long-term, seamless, optimised solution.
This feature tracks the long-term solution, where the PV filesystem will be mounted with the right SELinux context, avoiding the need to relabel every file.
Scenarios (mandatory)
Provide details for user scenarios including actions to be performed, platform specifications, and user personas.
As we are relying on the mount context, there should not be any relabeling (chcon), because all files/folders will inherit the context from the mount context.
More on design & scenarios in the KEP and related epic STOR-1173
Dependencies (internal and external) (mandatory)
None for the core feature
However, the driver will have to set SELinuxMountSupported to true in the CSIDriverSpec to enable this feature.
Contributing Teams(and contacts) (mandatory)
Our expectation is that teams would modify the list below to fit the epic. Some epics may not need all the default groups but what is included here should accurately reflect who will be involved in delivering the epic.
Done - Checklist (mandatory)
The following points apply to all epics and are what the OpenShift team believes are the minimum set of criteria that epics should meet for us to consider them potentially shippable. We request that epic owners modify this list to reflect the work to be completed in order to produce something that is potentially shippable.
Support the upstream feature "SELinux relabeling using mount options (CSIDriver API change)" in OCP as Beta, i.e. test it and have docs for it (unless it's Alpha upstream).
Summary: If a Pod has a defined SELinux context (e.g. it uses the "restricted" SCC), it uses a ReadWriteOncePod PVC, and the CSI driver responsible for the volume supports this feature, then kubelet + the CSI driver will mount the volume directly with the correct SELinux labels. Therefore CRI-O does not need to recursively relabel the volume, and pod startup can be significantly faster. We will need thorough documentation for this.
This upstream epic actually will be implemented by us!
As a cluster user, I want to use mounting with SELinux context without any configuration.
This means OCP ships CSIDriver objects with "SELinuxMount: true" for CSI drivers that support mounting with "-o context". I.e. all CSI drivers that are based on block volumes and use ext4/xfs should have this enabled.
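A sketch of what such a shipped CSIDriver object could look like (the upstream field is spelled seLinuxMount in the storage.k8s.io/v1 API; the driver name is illustrative):

```yaml
apiVersion: storage.k8s.io/v1
kind: CSIDriver
metadata:
  name: ebs.csi.aws.com  # illustrative driver
spec:
  seLinuxMount: true  # driver supports mounting with "-o context"
```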
This Epic is to track upstream work in the Storage SIG community
This Epic is to track the SELinux specific work required. fsGroup work is not included here.
Goal:
Continue contributing to and help move along the upstream efforts to enable recursive permissions functionality.
Finish current SELinuxMountReadWriteOncePod feature upstream:
The feature is probably going to stay alpha upstream.
Problem:
Recursive permission change takes very long for fsGroup and SELinux. For volumes with many small files, Kubernetes currently does a chown of every file on the volume (due to fsGroup). Similarly, container runtimes (such as CRI-O) perform a chcon of every file on the volume due to the SCC's SELinux context. Data on the volume may already have the correct GID/SELinux context, so Kubernetes needs a way to detect this automatically and avoid the long delay.
Why is this important:
Dependencies (internal and external):
Prioritized epics + deliverables (in scope / not in scope):
Estimate (XS, S, M, L, XL, XXL):
Previous Work:
Customers:
Open questions:
Notes:
As an OCP developer (and as an OCP user in the future), I want all CSI drivers shipped as part of OCP to support mounting with -o context=XYZ, so I can test with CSIDriver.SELinuxMount: true (or my pods run without CRI-O recursively relabeling my volume).
In detail:
Exit criteria:
Feature Goal*
What is our purpose in implementing this? What new capability will be available to customers?
The goal of this feature is to provide a consistent, predictable and deterministic approach to how the default storage class(es) are managed.
Why is this important? (mandatory)
The current default storage class implementation has corner cases which can result in PVCs staying in Pending because there is either no default storage class or there are multiple default storage classes defined.
Scenarios (mandatory)
Provide details for user scenarios including actions to be performed, platform specifications, and user personas.
No default storage class
In some cases there is no default SC defined. This can happen during OCP deployment, where components such as the registry request a PV while the SCs have not been defined yet. It can also happen during a change of the default SC: there won't be any default between the moment the admin unsets the current one and sets the new one.
A user creates a PVC requesting the default SC by leaving pvc.spec.storageClassName=nil. The default SC does not exist at this point, therefore the admission plugin leaves the PVC untouched with pvc.spec.storageClassName=nil.
The admin marks SC2 as default.
PV controller, when reconciling the PVC, updates pvc.spec.storageClassName from nil to the new SC2.
PV controller uses the new SC2 when binding / provisioning the PVC.
The installer creates a default SC.
PV controller, when reconciling the PVC, updates pvc.spec.storageClassName from nil to the new default SC.
PV controller uses the new default SC when binding / provisioning the PVC.
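A PVC exercising these scenarios would simply omit the storage class (a sketch; name and size are illustrative):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data  # illustrative
spec:
  # storageClassName left unset (nil): the admission plugin leaves it
  # untouched until a default SC exists, then the PV controller fills
  # in the new default retroactively.
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```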
Multiple Storage Classes
In some cases there are multiple default SCs. This can be an admin mistake (forgetting to unset the old one) or happen during the period when a new default SC is created while the old one is still present.
New behavior:
-> the PVC will get the default storage class with the newest CreationTimestamp (i.e. B) and no error should show.
-> admin will get an alert that there are multiple default storage classes and they should do something about it.
CSI drivers that are shipped as part of OCP
The CSI drivers we ship as part of OCP are deployed and managed by RH operators. These operators automatically create a default storage class. Some customers don't like this approach and prefer to:
Dependencies (internal and external) (mandatory)
What items must be delivered by other teams/groups to enable delivery of this epic.
No external dependencies.
Contributing Teams(and contacts) (mandatory)
Our expectation is that teams would modify the list below to fit the epic. Some epics may not need all the default groups but what is included here should accurately reflect who will be involved in delivering the epic.
Acceptance Criteria (optional)
Provide some (testable) examples of how we will know if we have achieved the epic goal.
Drawbacks or Risk (optional)
Can bring confusion to customers, as there is a change in the default behavior customers are used to. This needs to be carefully documented.
Done - Checklist (mandatory)
The following points apply to all epics and are what the OpenShift team believes are the minimum set of criteria that epics should meet for us to consider them potentially shippable. We request that epic owners modify this list to reflect the work to be completed in order to produce something that is potentially shippable.
Create a GCP cloud specific spec.resourceTags entry in the infrastructure CRD. This should create and update tags (or labels in GCP) on any OpenShift cloud resource that we create and manage. The behaviour should also cover tagging existing resources that do not have the tags yet, and once the tags in the infrastructure CRD are changed, all the resources should be updated accordingly.
Tag deletes continue to be out of scope, as the customer can still have custom tags applied to the resources that we do not want to delete.
Due to the ongoing in-tree/out-of-tree split of the cloud and CSI providers, this should not apply to clusters with in-tree providers (!= "external").
Once we are confident that all components are updated, we should introduce an end-to-end test that makes sure we never create untagged resources.
Goals
Requirement | Notes | isMvp? |
---|---|---|
CI - MUST be running successfully with test automation | This is a requirement for ALL features. | YES |
Release Technical Enablement | Provide necessary release enablement details and documents. | YES |
List any affected packages or components.
This epic covers the work to apply user-defined labels to GCP resources created for an OpenShift cluster, available as tech preview.
The user should be able to define GCP labels to be applied to the resources created during cluster creation by the installer and by the other operators that manage the specific resources. The user will define the required tags/labels in install-config.yaml while preparing the user inputs for cluster creation. They will then be made available in the status sub-resource of the Infrastructure custom resource, which cannot be edited but is available for user reference and is used by the in-cluster operators for labeling when the resources are created.
Updating/deleting labels added during cluster creation, or adding new labels as a Day-2 operation, is out of scope for this epic.
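A sketch of how this could look in install-config.yaml (the userLabels field name is an assumption based on the tech-preview API; keys and values are illustrative):

```yaml
platform:
  gcp:
    projectID: my-project   # illustrative
    region: us-central1     # illustrative
    userLabels:             # assumed field name for user-defined labels
    - key: environment
      value: production
```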
List any affected packages or components.
Reference - https://issues.redhat.com/browse/RFE-2017
cluster-config-operator makes the Infrastructure CRD available for the installer. The CRD is included in its container image from the openshift/api package, which requires the package to be updated to have the latest CRD.
The storage operators need to be automatically restarted after the certificates are renewed.
From OCP doc "The service CA certificate, which issues the service certificates, is valid for 26 months and is automatically rotated when there is less than 13 months validity left."
Since OCP now offers an 18-month lifecycle per release, the storage operator pods need to be automatically restarted after the certificates are renewed.
The storage operators will be transparently restarted. The customer benefit should be transparent: it avoids manual restarts of the storage operators.
The administrator should not need to restart the storage operators when certificates are renewed.
This should apply to all relevant operators with a consistent experience.
As an administrator I want the storage operators to be automatically restarted when certificates are renewed.
Include a list of refinement / architectural questions that may need to be answered before coding can begin. Initial completion during Refinement status.
High-level list of items that are out of scope. Initial completion during Refinement status.
This feature request is triggered by the new extended OCP lifecycle. We are moving from 12 to 18 months of support per release.
Provide any additional customer-specific considerations that must be made when designing and delivering the Feature. Initial completion during Refinement status.
No doc is required
This feature only covers storage, but the same behavior should be applied to every relevant component.
The pod `aws-ebs-csi-driver-controller` mounts the secret:
```
$ oc get po -n openshift-cluster-csi-drivers aws-ebs-csi-driver-controller-559f74d7cd-5tk4p -o yaml
...
    name: driver-kube-rbac-proxy
...
    name: provisioner-kube-rbac-proxy
...
    name: attacher-kube-rbac-proxy
...
    name: resizer-kube-rbac-proxy
...
    name: snapshotter-kube-rbac-proxy
    volumeMounts:
    - mountPath: /etc/tls/private
      name: metrics-serving-cert
...
  volumes:
  - name: metrics-serving-cert
    secret:
      defaultMode: 420
      secretName: aws-ebs-csi-driver-controller-metrics-serving-cert
```
Hence, if the secret is updated (e.g. as a result of a CA cert update), the Pod must be restarted.
Console enhancements based on customer RFEs that improve customer user experience.
Requirement | Notes | isMvp? |
---|
CI - MUST be running successfully with test automation | This is a requirement for ALL features. | YES |
Release Technical Enablement | Provide necessary release enablement details and documents. | YES |
This Section:
This Section: What does the person writing code, testing, documenting need to know? What context can be provided to frame this feature.
Questions to be addressed:
Based on https://issues.redhat.com/browse/RFE-3775 we should be extending our proxy package timeout to match the browser's timeout, which is 5 minutes.
AC: Bump the 30-second timeout in the proxy pkg to 5 minutes
Description of problem:
Even though in 4.11 we introduced LegacyServiceAccountTokenNoAutoGeneration to be compatible with upstream K8s and no longer generate secrets with tokens when service accounts are created, today OpenShift still creates secrets and tokens that are used for legacy usage of openshift-controller as well as for image-pull secrets.
Customer issues:
Customers see auto-generated secrets for service accounts, which is flagged as a security risk.
This Feature tracks the implementation of removing the legacy usage and the image-pull secret generation as well, so that NO secrets are auto-generated when a Service Account is created on an OpenShift cluster.
NO Secrets to be auto-generated when creating service accounts
The following secrets need to NOT be generated automatically with every Service Account creation:
Use Cases (Optional):
Include use case diagrams, main success scenarios, alternative flow scenarios. Initial completion during Refinement status.
Include a list of refinement / architectural questions that may need to be answered before coding can begin. Initial completion during Refinement status.
Concerns/Risks: Replacing the functionality of one of the openshift-controller controllers that's been in the code for a long time may impact behaviors that w
High-level list of items that are out of scope. Initial completion during Refinement status.
Provide any additional context needed to frame the feature. Initial completion during Refinement status.
Provide any additional customer-specific considerations that must be made when designing and delivering the Feature. Initial completion during Refinement status.
Existing documentation needs to be clear on where we are today and why we are providing the above 2 credentials. Related Tracker: https://issues.redhat.com/browse/OCPBUGS-13226
Which other projects and versions in our portfolio does this feature impact? What interoperability test scenarios should be factored by the layered products? Initial completion during Refinement status.
As an Infrastructure Administrator, I want to deploy OpenShift on Nutanix distributing the control plane and compute nodes across multiple regions and zones, forming different failure domains.
As an Infrastructure Administrator, I want to configure an existing OpenShift cluster to distribute the nodes across regions and zones, forming different failure domains.
Install OpenShift on Nutanix using IPI / UPI in multiple regions and zones.
This implementation would follow the same approach as was done for vSphere. The following are the main PRs for vSphere:
https://github.com/openshift/enhancements/blob/master/enhancements/installer/vsphere-ipi-zonal.md
Nutanix Zonal: Multiple regions and zones support for Nutanix IPI and Assisted Installer
Note
As a user, I want to be able to spread control plane nodes for an OCP cluster across Prism Elements (zones).
As our customers create more and more clusters, it will become vital for us to help them support their fleet of clusters. Currently, our users have to use a different interface (the ACM UI) in order to manage their fleet of clusters. Our goal is to provide our users with a single interface that spans from managing a fleet of clusters to deep-diving into a single cluster. This means going to a single URL – your Hub – to interact with your OCP fleet.
The goal of this tech preview update is to improve the experience from the last round of tech preview. The following items will be improved:
Key Objective
Providing our customers with a single simplified User Experience (Hybrid Cloud Console) that is extensible, can run locally or in the cloud, and is capable of managing the fleet down to deep-diving into a single cluster.
Why customers want this?
Why we want this?
Phase 2 Goal: Productization of the unified Console
Description of problem:
There is a possible race condition in the console operator where the managed cluster config gets updated after the console deployment and doesn't trigger a rollout.
Version-Release number of selected component (if applicable):
4.10
How reproducible:
Rarely
Steps to Reproduce:
1. Enable multicluster tech preview by adding the TechPreviewNoUpgrade featureSet to the FeatureGate config. (NOTE: THIS ACTION IS IRREVERSIBLE AND WILL MAKE THE CLUSTER UNUPGRADEABLE AND UNSUPPORTED)
2. Install ACM 2.5+
3. Import a managed cluster using either the ACM console or the CLI
4. Once that managed cluster is showing in the cluster dropdown, import a second managed cluster
Actual results:
Sometimes the second managed cluster will never show up in the cluster dropdown
Expected results:
The second managed cluster eventually shows up in the cluster dropdown after a page refresh
Additional info:
Migrated from bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2055415
In order for the hub cluster console OLM screens to behave as expected in a multicluster environment, we need to gather the "copiedCSVsDisabled" flags from managed clusters so that the console backend/frontend can consume this information.
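For reference, the flag in question lives on each managed cluster's OLMConfig resource (a sketch using the standard OLM API):

```yaml
apiVersion: operators.coreos.com/v1
kind: OLMConfig
metadata:
  name: cluster
spec:
  features:
    disableCopiedCSVs: true  # the "copiedCSVsDisabled" state to gather
```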
AC:
Allow configuring compute and control plane nodes across multiple subnets for on-premise IPI deployments. With nodes separated into subnets, also allow using an external load balancer, instead of the built-in one (keepalived/haproxy) that the IPI workflow installs, so that the customer can configure their own load balancer with the ingress and API VIPs pointing to nodes in the separate subnets.
I want to install OpenShift with IPI on an on-premise platform (high priority for bare metal and vSphere) and I need to distribute my control plane and compute nodes across multiple subnets.
I want to use the IPI automation but I will configure an external load balancer for the API and Ingress VIPs, instead of using the built-in keepalived/haproxy-based load balancer that comes with the on-prem platforms.
Customers require using multiple logical availability zones to define their architecture and topology for their datacenter. OpenShift clusters are expected to fit in this architecture for the high availability and disaster recovery plans of their datacenters.
Customers want the benefits of IPI and automated installations (and avoid UPI) and at the same time when they expect high traffic in their workloads they will design their clusters with external load balancers that will have the VIPs of the OpenShift clusters.
Load balancers can distribute incoming traffic across multiple subnets, which is something our built-in load balancers aren't able to do and which represents a big limitation for the topologies customers are designing.
While this is possible with IPI AWS, this isn't available with on-premise platforms installed with IPI (for the control plane nodes specifically), and customers see this as a gap in OpenShift for on-premise platforms.
Epic | Control Plane with Multiple Subnets | Compute with Multiple Subnets | Doesn't need external LB | Built-in LB |
---|---|---|---|---|
| | ✓ | ✓ | ✓ | ✓ |
| | ✓ | ✓ | ✓ | ✕ |
| | ✓ | ✓ | ✓ | ✓ |
| | ✓ | ✓ | ✓ | ✓ |
| | ✓ | ✓ | ✓ | |
| | ✓ | ✓ | ✓ | ✕ |
| | ✓ | ✓ | ✓ | ✕ |
| | ✓ | ✓ | ✓ | ✓ |
| | ✕ | ✓ | ✓ | ✓ |
| | ✕ | ✓ | ✓ | ✓ |
| | ✕ | ✓ | ✓ | ✓ |
Workers on separate subnets with IPI documentation
We can already deploy compute nodes on separate subnets by preventing the built-in LBs from running on the compute nodes. This is documented for bare metal only for the Remote Worker Nodes use case: https://docs.openshift.com/container-platform/4.11/installing/installing_bare_metal_ipi/ipi-install-installation-workflow.html#configure-network-components-to-run-on-the-control-plane_ipi-install-installation-workflow
This procedure works on vSphere too, albeit with no QE coverage in CI and no documentation.
External load balancer with IPI documentation
As an OpenShift infrastructure owner I need to deploy OCP on OpenStack with the installer-provisioned infrastructure workflow and configure my own load balancers
Customers want to use their own load balancers, and IPI comes with built-in LBs based on keepalived and haproxy.
vsphere has done the work already via https://issues.redhat.com/browse/SPLAT-409
This is needed once the API patch for External LB has merged.
Traditionally we did these updates as bugfixes, because we did them after the feature freeze (FF). Trying no-feature-freeze in 4.12. We will try to do as much as we can before FF, but we're quite sure something will slip past FF as usual.
Update all OCP and kubernetes libraries in storage operators to the appropriate version for OCP release.
This includes (but is not limited to):
Operators:
Update all CSI sidecars to the latest upstream release from https://github.com/orgs/kubernetes-csi/repositories
Corresponding downstream repos have `csi-` prefix, e.g. github.com/openshift/csi-external-attacher.
This includes updating the VolumeSnapshot CRDs in the cluster-csi-snapshot-controller-operator assets and the client API in go.mod, i.e. copy all snapshot CRDs from upstream to the operator assets + `go get -u github.com/kubernetes-csi/external-snapshotter/client/v6` in the operator repo.
OCP/Telco Definition of Done
Epic Template descriptions and documentation.
<--- Cut-n-Paste the entire contents of this description into your new Epic --->
This work will require updates to the core OpenShift API repository to add the new platform type, and then a distribution of this change to all components that use the platform type information. For components that partners might replace, per-component action will need to be taken, with the project team's guidance, to ensure that the component properly handles the "External" platform. These changes will look slightly different for each component.
To integrate these changes more easily into OpenShift, it is possible to take a multi-phase approach which could be spread over a release boundary (e.g. phase 1 is done in 4.X, phase 2 is done in 4.X+1).
OCPBU-5: Phase 1
OCPBU-510: Phase 2
OCPBU-329: Phase.Next
Phase 1
Phase 2
Phase 3
As a Red Hat Partner installing OpenShift using the External platform type, I would like to install my own Cloud Controller Manager (CCM). Having a field in the Infrastructure configuration object to signal that I will install my own CCM, and that Kubernetes should be configured to expect an external CCM, will allow me to run my own CCM on new OpenShift deployments.
This work has been defined in the External platform enhancement and had previously been part of openshift/api. The CCM API pieces were removed for the 4.13 release of OpenShift to ensure that we did not ship unused portions of the API.
In addition to the API changes, library-go will need an update to the IsCloudProviderExternal function to detect if the External platform is selected and if the CCM should be enabled for external mode.
We will also need to check the ObserveCloudVolumePlugin function to ensure that it is not affected by the external changes and that it continues to use the external volume plugin.
After updating openshift/library-go, it will need to be re-vendored into the MCO, KCMO, and CCCMO (although the last is not as critical as the other 2).
Update ETCD datastore encryption to use AES-GCM instead of AES-CBC
2. What is the nature and description of the request?
The current ETCD datastore encryption solution uses the aes-cbc cipher. This cipher is now considered "weak" and is susceptible to a padding oracle attack. Upstream recommends using the AES-GCM cipher. AES-GCM will require automation to rotate secrets every 200k writes.
The cipher used is hard coded.
3. Why is this needed? (List the business requirements here).
Security-conscious customers will not accept the presence and use of weak ciphers in an OpenShift cluster. Continuing to use the AES-CBC cipher will create friction in sales and, for existing customers, may result in OpenShift being blocked from deployment in production.
4. List any affected packages or components.
Epic Goal*
What is our purpose in implementing this? What new capability will be available to customers?
The Kube APIserver is used to set the encryption of data stored in etcd. See https://docs.openshift.com/container-platform/4.11/security/encrypting-etcd.html
Today with OpenShift 4.11 or earlier, only aescbc is allowed as the encryption field type.
RFE-3095 asks that aesgcm (which is an updated and more recent type) be supported. Furthermore, RFE-3338 asks for more customizability, which brings us to how we have implemented cipher customization with tlsSecurityProfile. See https://docs.openshift.com/container-platform/4.11/security/tls-security-profiles.html
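With that support in place, enabling the new cipher would presumably mirror today's aescbc configuration on the APIServer config (a sketch):

```yaml
apiVersion: config.openshift.io/v1
kind: APIServer
metadata:
  name: cluster
spec:
  encryption:
    type: aesgcm  # today only aescbc is accepted here
```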
Why is this important? (mandatory)
AES-CBC is considered a weak cipher
Scenarios (mandatory)
Provide details for user scenarios including actions to be performed, platform specifications, and user personas.
Dependencies (internal and external) (mandatory)
What items must be delivered by other teams/groups to enable delivery of this epic.
Contributing Teams(and contacts) (mandatory)
Our expectation is that teams would modify the list below to fit the epic. Some epics may not need all the default groups but what is included here should accurately reflect who will be involved in delivering the epic.
Acceptance Criteria (optional)
Provide some (testable) examples of how we will know if we have achieved the epic goal.
Drawbacks or Risk (optional)
Reasons we should consider NOT doing this such as: limited audience for the feature, feature will be superseded by other work that is planned, resulting feature will introduce substantial administrative complexity or user confusion, etc.
Done - Checklist (mandatory)
The following points apply to all epics and are what the OpenShift team believes are the minimum set of criteria that epics should meet for us to consider them potentially shippable. We request that epic owners modify this list to reflect the work to be completed in order to produce something that is potentially shippable.
The new aesgcm encryption provider was added in 4.13 as tech preview, but as part of https://issues.redhat.com/browse/API-1509, the feature needs to be GA in OCP 4.13.
The console operator should build up a set of the cluster nodes' OS types, which it should supply to the console so that it renders only operators that can be installed on the cluster.
This will be needed once we support different OS types on the cluster.
We need to scan through the compute nodes and build a set of supported OSes from them. Each node on the cluster has a label for its operating system, e.g. kubernetes.io/os=linux.
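Illustrative node metadata showing the label the operator would aggregate across compute nodes:

```yaml
apiVersion: v1
kind: Node
metadata:
  name: worker-0  # illustrative
  labels:
    kubernetes.io/os: linux  # the OS label the operator builds its set from
```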
AC:
Key Objective
Providing our customers with a single simplified User Experience (Hybrid Cloud Console) that is extensible, can run locally or in the cloud, and is capable of managing the fleet down to deep-diving into a single cluster.
Why customers want this?
Why we want this?
Phase 2 Goal: Productization of the unified Console
We need a way to show metrics for workloads running on spoke clusters. This depends on ACM-876, which lets the console discover the monitoring endpoints.
Open Issues:
We will depend on ACM to create a route on each spoke cluster for the prometheus tenancy service, which is required for metrics for normal users.
The OpenShift console backend should proxy managed cluster monitoring requests through the MCE cluster proxy addon to the Prometheus services on the managed cluster. This depends on https://issues.redhat.com/browse/ACM-1188
Pre-Work Objectives
Since some of our requirements from the ACM team will not be available for the 4.12 timeframe, the team should work on anything we can get done in the scope of the console repo so that when the required items are available in 4.13, we can be more nimble in delivering GA content for the Unified Console Epic.
Overall GA Key Objective
Providing our customers with a single simplified User Experience (Hybrid Cloud Console) that is extensible, can run locally or in the cloud, and is capable of managing the fleet down to deep-diving into a single cluster.
Why customers want this?
Why we want this?
Phase 2 Goal: Productization of the unified Console
As a developer I would like to disable clusters like *KS that we can't support for multi-cluster (for instance because we can't authenticate). The ManagedCluster resource has a vendor label that we can use to know if the cluster is supported.
cc Ali Mobrem Sho Weimer Jakub Hadvig
UPDATE 9/20/22: we want an allow-list with OpenShift, ROSA, ARO, ROKS, and OpenShiftDedicated
Acceptance criteria:
Description/Acceptance Criteria:
1. Proposed title of this feature request
BYOK encrypts root vols AND default storageclass
2. What is the nature and description of the request?
User story
As a customer spinning up managed OpenShift clusters, if I pass a custom AWS KMS key to the installer, I expect it (installer and cluster-storage-operator) to not only encrypt the root volumes for the nodes in the cluster, but also to apply that key to the first/default StorageClass (gp2 in the current case), so that my assumptions around passing a custom key are met.
In the current state, if I pass a KMS key to the installer, only root volumes are encrypted with it, and the default AWS managed key is used for the default StorageClass.
Perhaps this could be offered as a flag in the installer to further pass the key to the storage class, or not.
3. Why does the customer need this? (List the business requirements here)
Customers wish to encrypt their owned volumes with their selected key instead of accidentally using the default AWS account key.
4. List any affected packages or components.
Note: this implementation should take effect on AWS, GCP and Azure (any cloud provider) equally.
As a cluster admin, I want OCP to provision new volumes with my custom encryption key that I specified during cluster installation in install-config.yaml so all OCP assets (PVs, VMs & their root disks) use the same encryption key.
Description of criteria:
Re-encryption of existing PVs with a new key is out of scope; only newly provisioned PVs will use the new key.
Enhancement (incl. TBD API with encryption key reference) will be provided as part of https://issues.redhat.com/browse/CORS-2080.
"Raw meat" of this story is translation of the key reference in TBD API to StorageClass.Parameters. AWS EBS CSi driver operator should update both the StorageClass it manages (managed-csi) with:
Parameters:
encrypted: "true"
kmsKeyId: "arn:aws:kms:us-east-1:012345678910:key/abcd1234-a123-456a-a12b-a123b4cd56ef"
Upstream docs: https://github.com/kubernetes-sigs/aws-ebs-csi-driver/blob/master/docs/parameters.md
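Putting it together, the updated StorageClass could look like this (a sketch; the key ARN is the example from above, and the exact API for the key reference is TBD per CORS-2080):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: managed-csi
provisioner: ebs.csi.aws.com
parameters:
  encrypted: "true"
  kmsKeyId: "arn:aws:kms:us-east-1:012345678910:key/abcd1234-a123-456a-a12b-a123b4cd56ef"
```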
More details at ARO managed identity scope and impact.
This Section: A list of specific needs or objectives that a Feature must deliver to satisfy the Feature. Some requirements will be flagged as MVP. If an MVP gets shifted, the feature shifts. If a non-MVP requirement slips, it does not shift the feature.
Requirement | Notes | isMvp? |
---|---|---|
CI - MUST be running successfully with test automation | This is a requirement for ALL features. | YES |
Release Technical Enablement | Provide necessary release enablement details and documents. | YES |
This Section:
This Section: What does the person writing code, testing, documenting need to know? What context can be provided to frame this feature.
Questions to be addressed:
This document describes the expectations for promoting a feature that is behind a feature gate.
The criteria include:
Console enhancements based on customer RFEs that improve customer user experience.
Requirement | Notes | isMvp? |
---|
CI - MUST be running successfully with test automation | This is a requirement for ALL features. | YES |
Release Technical Enablement | Provide necessary release enablement details and documents. | YES |
This Section:
This Section: What does the person writing code, testing, documenting need to know? What context can be provided to frame this feature.
Questions to be addressed:
According to security best practice, it's recommended to set readOnlyRootFilesystem: true for all containers running on Kubernetes. Given that openshift-console does not set that explicitly, it's requested that this be evaluated and, if possible, set to readOnlyRootFilesystem: true, or otherwise to readOnlyRootFilesystem: false with an explanation of why the filesystem needs to be writable.
3. Why does the customer need this? (List the business requirements here)
Extensive security audits are run on OpenShift Container Platform 4 and are highlighting that many vendor-specific containers fail to set readOnlyRootFilesystem: true or else to justify why readOnlyRootFilesystem: false is set.
AC: Set the readOnlyRootFilesystem field on both the console and console-operator deployments' specs. Part of the work is to determine the value: true if the pod does not do any writing to its filesystem, otherwise false.
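A sketch of the resulting container securityContext (assuming the console writes nothing to its local filesystem; if it does, the value flips to false with a justification):

```yaml
# Fragment of the console Deployment's pod template
containers:
- name: console
  securityContext:
    readOnlyRootFilesystem: true  # assumed; verify the pod writes nothing locally
```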
Unify and update the hosted control plane storage operators so that they have similar code patterns and can run properly in both standalone OCP and HyperShift's control plane.
Include use case diagrams, main success scenarios, alternative flow scenarios. Initial completion during Refinement status.
Include a list of refinement / architectural questions that may need to be answered before coding can begin. Initial completion during Refinement status.
High-level list of items that are out of scope. Initial completion during Refinement status.
Provide any additional context needed to frame the feature. Initial completion during Refinement status.
Provide any additional customer-specific considerations that must be made when designing and delivering the Feature. Initial completion during Refinement status.
Provide information that needs to be considered and planned so that documentation will meet customer needs. Initial completion during Refinement status.
Which other projects and versions in our portfolio does this feature impact? What interoperability test scenarios should be factored by the layered products? Initial completion during Refinement status.
Epic Goal*
Our current design of the EBS driver operator to support HyperShift does not scale well to other drivers. The existing design will lead to more code duplication between driver operators and a greater possibility of errors.
Why is this important? (mandatory)
An improved design will allow more storage drivers and their operators to be added to HyperShift without requiring significant changes to the code internals.
Scenarios (mandatory)
Dependencies (internal and external) (mandatory)
What items must be delivered by other teams/groups to enable delivery of this epic.
Contributing Teams(and contacts) (mandatory)
Our expectation is that teams would modify the list below to fit the epic. Some epics may not need all the default groups but what is included here should accurately reflect who will be involved in delivering the epic.
Acceptance Criteria (optional)
Provide some (testable) examples of how we will know if we have achieved the epic goal.
Drawbacks or Risk (optional)
Reasons we should consider NOT doing this such as: limited audience for the feature, feature will be superseded by other work that is planned, resulting feature will introduce substantial administrative complexity or user confusion, etc.
Done - Checklist (mandatory)
The following points apply to all epics and are what the OpenShift team believes are the minimum set of criteria that epics should meet for us to consider them potentially shippable. We request that epic owners modify this list to reflect the work to be completed in order to produce something that is potentially shippable.
Until the final structure is finalized, we should be able to build and test the aws-ebs image from the legacy folder.
This feature is the placeholder for all epics related to technical debt associated with the Console team.
Outcome Overview
Once all Features and/or Initiatives in this Outcome are complete, what tangible, incremental, and (ideally) measurable movement will be made toward the company's Strategic Goal(s)?
Success Criteria
What is the success criteria for this strategic outcome? Avoid listing Features or Initiatives and instead describe "what must be true" for the outcome to be considered delivered.
Expected Results (what, how, when)
What incremental impact do you expect to create toward the company's Strategic Goals by delivering this outcome? (possible examples: unblocking sales, shifts in product metrics, etc.; provide links to metrics that will be used post-completion for review & pivot decisions). For each expected result, list what you will measure and when you will measure it (e.g. provide links to existing information or metrics that will be used post-completion for review, and specify when you will review the measurement, such as 60 days after the work is complete).
Post Completion Review – Actual Results
After completing the work (as determined by the "when" in Expected Results above), list the actual results observed / measured during Post Completion review(s).
An epic we can duplicate for each release to ensure we have a place to catch things we ought to be doing regularly but can tend to fall by the wayside.
Added client certificates based on https://github.com/deads2k/openshift-enhancements/blob/master/enhancements/monitoring/client-cert-scraping.md
Upstream Kubernetes is following other SIGs by moving its in-tree cloud providers to an out-of-tree plugin format, Cloud Controller Manager, in a future Kubernetes release. OpenShift needs to be ready to action this change.
GA of the cloud controller manager for the GCP platform
A list of specific needs or objectives that a feature must deliver in order to be considered complete. Be sure to include nonfunctional requirements such as security, reliability, performance, maintainability, scalability, usability, etc. Initial completion during Refinement status.
Include use case diagrams, main success scenarios, alternative flow scenarios. Initial completion during Refinement status.
Include a list of refinement / architectural questions that may need to be answered before coding can begin. Initial completion during Refinement status.
High-level list of items that are out of scope. Initial completion during Refinement status.
Provide any additional context is needed to frame the feature. Initial completion during Refinement status.
Provide any additional customer-specific considerations that must be made when designing and delivering the Feature. Initial completion during Refinement status.
Provide information that needs to be considered and planned so that documentation will meet customer needs. Initial completion during Refinement status.
Which other projects and versions in our portfolio does this feature impact? What interoperability test scenarios should be factored by the layered products? Initial completion during Refinement status.
To make the CCM GA, we need to update the switch case in library-go to make sure the GCP CCM is always considered external.
We then need to update the vendoring in KCMO, CCMO, KASO and MCO.
Create an Azure-specific spec.resourceTags entry in the Infrastructure CRD. This should create and update tags (or labels in Azure) on any OpenShift cloud resource that we create and manage. The behaviour should also tag existing resources that do not have the tags yet, and once the tags in the Infrastructure CRD are changed, all the resources should be updated accordingly.
Tag deletes continue to be out of scope, as the customer can still have custom tags applied to the resources that we do not want to delete.
Due to the ongoing in-tree/out-of-tree split of the cloud and CSI providers, this should not apply to clusters with in-tree providers (!= "external").
Once confident we have all components updated, we should introduce an end-to-end test that makes sure we never create untagged resources.
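For illustration, a sketch of what such an entry might look like; the field placement follows this card's proposal and is an assumption, not necessarily the final openshift/api schema:

```yaml
apiVersion: config.openshift.io/v1
kind: Infrastructure
metadata:
  name: cluster
spec:
  platformSpec:
    type: Azure
    azure:
      # Hypothetical field per this card's proposal:
      resourceTags:
        - key: cost-center
          value: "1234"
        - key: environment
          value: test
```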
Goals
Requirement | Notes | isMvp? |
---|---|---|
CI - MUST be running successfully with test automation | This is a requirement for ALL features. | YES |
Release Technical Enablement | Provide necessary release enablement details and documents. | YES |
List any affected packages or components.
This epic covers the work to apply user-defined tags to Azure resources created for an OpenShift cluster, made available as Tech Preview.
The user should be able to define the Azure tags to be applied to the resources created during cluster creation by the installer and by the other operators that manage those resources. The user defines the required tags in install-config.yaml while preparing the inputs for cluster creation. These tags are then made available in the status sub-resource of the Infrastructure custom resource, which cannot be edited but is available for user reference and is used by the in-cluster operators for tagging when resources are created.
Updating/deleting of tags added during cluster creation or adding new tags as Day-2 operation is out of scope of this epic.
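For illustration, an install-config.yaml excerpt of the kind of user input described above; the userTags field name follows the existing Azure install-config convention and is an assumption about this epic's final shape:

```yaml
apiVersion: v1
baseDomain: example.com
metadata:
  name: mycluster
platform:
  azure:
    region: eastus
    # Assumed field carrying the user-defined tags:
    userTags:
      team: dev
      environment: test
```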
List any affected packages or components.
Reference - https://issues.redhat.com/browse/RFE-2017
cluster-config-operator makes the Infrastructure CRD available to the installer; the CRD is included in its container image from the openshift/api package, so the package must be updated to pick up the latest CRD.
Epic Goal*
What is our purpose in implementing this? What new capability will be available to customers?
Why is this important? (mandatory)
What are the benefits to the customer or Red Hat? Does it improve security, performance, supportability, etc? Why is work a priority?
Scenarios (mandatory)
Provide details for user scenarios including actions to be performed, platform specifications, and user personas.
Dependencies (internal and external) (mandatory)
What items must be delivered by other teams/groups to enable delivery of this epic.
Contributing Teams(and contacts) (mandatory)
Our expectation is that teams would modify the list below to fit the epic. Some epics may not need all the default groups but what is included here should accurately reflect who will be involved in delivering the epic.
Acceptance Criteria (optional)
Provide some (testable) examples of how we will know if we have achieved the epic goal.
Drawbacks or Risk (optional)
Reasons we should consider NOT doing this such as: limited audience for the feature, feature will be superseded by other work that is planned, resulting feature will introduce substantial administrative complexity or user confusion, etc.
Done - Checklist (mandatory)
The following points apply to all epics and are what the OpenShift team believes are the minimum set of criteria that epics should meet for us to consider them potentially shippable. We request that epic owners modify this list to reflect the work to be completed in order to produce something that is potentially shippable.
Traditionally we did these updates as bugfixes, because we did them after the feature freeze (FF). We are trying the no-feature-freeze process in 4.12. We will try to do as much as we can before FF, but we're quite sure something will slip past FF as usual.
Update all OCP and kubernetes libraries in storage operators to the appropriate version for OCP release.
This includes (but is not limited to):
Operators:
EOL, do not upgrade:
Update the driver to the latest upstream release. Notify QE and docs with any new features and important bugfixes that need testing or documentation.
(Using separate cards for each driver because these updates can be more complicated)
Update all CSI sidecars to the latest upstream release from https://github.com/orgs/kubernetes-csi/repositories
Corresponding downstream repos have `csi-` prefix, e.g. github.com/openshift/csi-external-attacher.
This includes updating the VolumeSnapshot CRDs in the cluster-csi-snapshot-controller-operator assets and the client API in go.mod, i.e. copy all snapshot CRDs from upstream into the operator assets and run `go get -u github.com/kubernetes-csi/external-snapshotter/client/v6` in the operator repo (a sketch follows below).
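A rough sketch of the update steps described above, assuming a checkout of the upstream external-snapshotter repo next to the operator repo (the asset path is illustrative):

```sh
# In the cluster-csi-snapshot-controller-operator repo:
go get -u github.com/kubernetes-csi/external-snapshotter/client/v6
go mod tidy && go mod vendor

# Copy the upstream snapshot CRDs into the operator assets (paths are assumptions):
cp ../external-snapshotter/client/config/crd/*.yaml assets/
```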
An elevator pitch (value statement) that describes the Feature in a clear, concise way. Complete during New status.
The observable functionality that the user now has as a result of receiving this feature. Complete during New status.
A list of specific needs or objectives that a feature must deliver in order to be considered complete. Be sure to include nonfunctional requirements such as security, reliability, performance, maintainability, scalability, usability, etc. Initial completion during Refinement status.
Include use case diagrams, main success scenarios, alternative flow scenarios. Initial completion during Refinement status.
Include a list of refinement / architectural questions that may need to be answered before coding can begin. Initial completion during Refinement status.
High-level list of items that are out of scope. Initial completion during Refinement status.
Provide any additional context needed to frame the feature. Initial completion during Refinement status.
Provide any additional customer-specific considerations that must be made when designing and delivering the Feature. Initial completion during Refinement status.
Provide information that needs to be considered and planned so that documentation will meet customer needs. Initial completion during Refinement status.
Which other projects and versions in our portfolio does this feature impact? What interoperability test scenarios should be factored by the layered products? Initial completion during Refinement status.
As a developer of serverless functions, I have no samples to start from; we currently don't provide any.
Provide Serverless Function samples in the sample catalog. These would be utilizing the Builder Image capabilities.
As an operator author, I want to provide additional samples that are tied to an operator version, not an OpenShift release. For that, I want to create a resource to add new samples to the web console.
The OpenShift API needs to be updated to define VSphereFailureDomain. A draft PR is here: https://github.com/openshift/api/pull/1539
Also, ensure that the client-go and openshift-cluster-config-operator projects are bumped once the API changes merge.
Epic Goal*
What is our purpose in implementing this? What new capability will be available to customers?
Why is this important? (mandatory)
What are the benefits to the customer or Red Hat? Does it improve security, performance, supportability, etc? Why is work a priority?
Scenarios (mandatory)
Provide details for user scenarios including actions to be performed, platform specifications, and user personas.
Dependencies (internal and external) (mandatory)
What items must be delivered by other teams/groups to enable delivery of this epic.
Contributing Teams(and contacts) (mandatory)
Our expectation is that teams would modify the list below to fit the epic. Some epics may not need all the default groups but what is included here should accurately reflect who will be involved in delivering the epic.
Acceptance Criteria (optional)
Provide some (testable) examples of how we will know if we have achieved the epic goal.
Drawbacks or Risk (optional)
Reasons we should consider NOT doing this such as: limited audience for the feature, feature will be superseded by other work that is planned, resulting feature will introduce substantial administrative complexity or user confusion, etc.
Done - Checklist (mandatory)
The following points apply to all epics and are what the OpenShift team believes are the minimum set of criteria that epics should meet for us to consider them potentially shippable. We request that epic owners modify this list to reflect the work to be completed in order to produce something that is potentially shippable.
Traditionally we did these updates as bugfixes, because we did them after the feature freeze (FF).
Update all CSI sidecars to the latest upstream release from https://github.com/orgs/kubernetes-csi/repositories
Corresponding downstream repos have `csi-` prefix, e.g. github.com/openshift/csi-external-attacher.
This includes updating the VolumeSnapshot CRDs in the cluster-csi-snapshot-controller-operator assets and the client API in go.mod, i.e. copy all snapshot CRDs from upstream into the operator assets and run `go get -u github.com/kubernetes-csi/external-snapshotter/client/v6` in the operator repo.
Cluster administrators need an in-product experience to discover and install new Red Hat offerings that can add high value to developer workflows.
Requirements | Notes | IS MVP |
---|---|---|
Discover new offerings in Home Dashboard | Y | |
Access details outlining value of offerings | Y | |
Access step-by-step guide to install offering | N | |
Allow developers to easily find and use newly installed offerings | Y | |
Support air-gapped clusters | Y |
< What are we making, for who, and why/what problem are we solving?>
Discovering solutions that are not available for installation on cluster
No known dependencies
Background, and strategic fit
None
Quick Starts
Cluster admins need to be guided to install RHDH on the cluster.
Enable admins to discover RHDH, be guided to install it on the cluster, and verify its configuration.
RHDH is a key multi-cluster offering for developers. This will enable customers to self-discover and install RHDH.
RHDH operator
Description of problem:
The OpenShift Console quick start that promotes RHDH was written in generic terms and doesn't include information on how to use the CRD-based installation.
We removed this specific information because the operator wasn't ready at that time. As soon as the RHDH operator is available in OperatorHub, we should update the quick start with more detailed information: a simple CR example and some info on how to customize the base URL or colors.
Version-Release number of selected component (if applicable):
4.15
How reproducible:
Always
Steps to Reproduce:
Just navigate to Quick starts and select the "Install Red Hat Developer Hub (RHDH) with an Operator" quick start
Actual results:
The RHDH Operator Quick start exists but is written in a generic way.
Expected results:
The RHDH Operator Quick start should contain some more specific information.
Additional info:
Initial PR: https://github.com/openshift/console-operator/pull/806
Description of problem:
The OpenShift Console quick starts promote RHDH but also include Janus IDP information.
The Janus IDP quick start and all information about Janus IDP should be removed.
Version-Release number of selected component (if applicable):
4.15
How reproducible:
Always
Steps to Reproduce:
Just navigate to Quick starts and select the "Install Red Hat Developer Hub (RHDH) with an Operator" quick start
Actual results:
Expected results:
Additional info:
Initial PR: https://github.com/openshift/console-operator/pull/806
The MCO should properly report its state in a way that's consistent and able to be understood by customers, troubleshooters, and maintainers alike.
Some customer cases have revealed scenarios where the MCO state reporting is misleading and therefore could be unreliable to base decisions and automation on.
In addition to correcting some incorrect states, the MCO will be enhanced for a more granular view of update rollouts across machines.
The MCO should properly report its state in a way that's consistent and able to be understood by customers, troubleshooters, and maintainers alike.
For this epic, "state" means "what is the MCO doing?" – so the goal here is to try to make sure that it's always known what the MCO is doing.
This includes:
While this probably crosses a little bit into the "status" portion of certain MCO objects, as some state is definitely recorded there, this probably shouldn't turn into a "better status reporting" epic. I'm interpreting "status" to mean "how is it going" so status is maybe a "detail attached to a state".
Exploration here: https://docs.google.com/document/d/1j6Qea98aVP12kzmPbR_3Y-3-meJQBf0_K6HxZOkzbNk/edit?usp=sharing
https://docs.google.com/document/d/17qYml7CETIaDmcEO-6OGQGNO0d7HtfyU7W4OMA6kTeM/edit?usp=sharing
Ensure that the pod exists but the functionality behind the pod is not exposed by default in the release version this work ships in.
This can be done by creating a new featuregate in openshift/api, vendoring that into the cluster config operator, and then checking for this featuregate in the state controller code of the MCO.
OCP/Telco Definition of Done
Epic Template descriptions and documentation.
<--- Cut-n-Paste the entire contents of this description into your new Epic --->
Bump the following libraries in order, starting with the latest kube and then the dependent libraries:
Prev Ref:
https://github.com/openshift/api/pull/1534
https://github.com/openshift/client-go/pull/250
https://github.com/openshift/library-go/pull/1557
https://github.com/openshift/apiserver-library-go/pull/118
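A sketch of the bump sequence implied by the list above; versions are placeholders, and the order follows the dependency chain (api, then client-go, then library-go, then apiserver-library-go):

```sh
go get github.com/openshift/api@latest
go get github.com/openshift/client-go@latest
go get github.com/openshift/library-go@latest
go get github.com/openshift/apiserver-library-go@latest
go mod tidy && go mod vendor
```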
Extend the Workload Partitioning feature to support multi-node clusters.
Customers running RAN workloads on C-RAN Hubs (i.e. multi-node clusters) that want to maximize the cores available to the workloads (DU) should be able to utilize Workload Partitioning to isolate control-plane processes to reserved cores.
Requirements
A list of specific needs or objectives that a Feature must deliver to satisfy the Feature. Some requirements will be flagged as MVP. If an MVP gets shifted, the feature shifts. If a non MVP requirement slips, it does not shift the feature.
requirement | Notes | isMvp? |
< How will the user interact with this feature? >
< Which users will use this and when will they use it? >
< Is this feature used as part of current user interface? >
< What does the person writing code, testing, documenting need to know? >
< Are there assumptions being made regarding prerequisites and dependencies?>
< Are there assumptions about hardware, software or people resources?>
< Are there specific customer environments that need to be considered (such as working with existing h/w and software)?>
< Are there Upgrade considerations that customers need to account for or that the feature should address on behalf of the customer?>
<Does the Feature introduce data that could be gathered and used for Insights purposes?>
< What educational or reference material (docs) is required to support this product feature? For users/admins? Other functions (security officers, etc)? >
< What does success look like?>
< Does this feature have doc impact? Possible values are: New Content, Updates to existing content, Release Note, or No Doc Impact>
< If unsure and no Technical Writer is available, please contact Content Strategy. If yes, complete the following.>
< Which other products and versions in our portfolio does this feature impact?>
< What interoperability test scenarios should be factored by the layered product(s)?>
Question | Outcome |
Reduce the OpenShift platform and associated RH provided components to a single physical core on Intel Sapphire Rapids platform for vDU deployments on SingleNode OpenShift.
Requirement | Notes | isMvp? |
---|---|---|
CI - MUST be running successfully with test automation | This is a requirement for ALL features. | YES |
Release Technical Enablement | Provide necessary release enablement details and documents. | YES |
Provide a mechanism to tune the platform to use only one physical core. | Users need to be able to tune different platforms. | YES |
Allow for full zero touch provisioning of a node with the minimal core budget configuration. | Node provisioned with SNO Far Edge provisioning method - i.e. ZTP via RHACM, using DU Profile. | YES |
Platform meets all MVP KPIs | | YES |
Questions to be addressed:
When this image was assembled, these features were not yet completed. Therefore, only the Jira Cards included here are part of this release
For users who are using OpenShift but have not yet begun to explore multicluster and what we offer them.
I'm investigating where Learning paths are today and what is required.
As a user, I'd like to have a learning path for how to get started with multicluster.
Install MCE
Create multiple clusters
Use HyperShift
Provide access to cluster creation to devs via templates
Scale up to ACM/ACS (OPP?)
Status
https://github.com/patternfly/patternfly-quickstarts/issues/37#issuecomment-1199840223
Goal: Resources provided via the Dynamic Resource Allocation Kubernetes mechanism can be consumed by VMs.
Details: Dynamic Resource Allocation
Come up with a design of how resources provided by Dynamic Resource Allocation can be consumed by KubeVirt VMs.
The Dynamic Resource Allocation (DRA) feature is an alpha API in Kubernetes 1.26, which is the base for OpenShift 4.13.
This feature provides the ability to create ResourceClaims and ResourceClasses to request access to resources. This is similar to the dynamic provisioning of PersistentVolumes via PersistentVolumeClaims and StorageClasses.
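For a sense of the API shape, a minimal sketch against the 1.26 alpha API; the driver name is a placeholder and alpha field names may change between releases:

```yaml
apiVersion: resource.k8s.io/v1alpha1
kind: ResourceClass
metadata:
  name: example-gpu
# DRA driver that handles allocation for this class (placeholder name):
driverName: gpu.example.com
---
apiVersion: resource.k8s.io/v1alpha1
kind: ResourceClaim
metadata:
  name: my-gpu-claim
spec:
  resourceClassName: example-gpu
```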
NVIDIA has been a lead contributor to the KEP and already has an initial implementation of a DRA driver and plugin, with a nice demo recording. NVIDIA expects to have this DRA driver available in CY23 Q3 or Q4, so likely in NVIDIA GPU Operator v23.9, around OpenShift 4.14.
When asked about the availability of MIG-backed vGPU for Kubernetes, NVIDIA said that the timeframe is not decided yet, because it will likely use DRA for the creation of MIG devices and their registration with the vGPU host driver. The MIG-backed vGPU feature for OpenShift Virtualization will then likely require DRA support to request vGPU resources for the VMs.
Not having MIG-backed vGPU is a risk for OpenShift Virtualization adoption in GPU use cases, such as virtual workstations for rendering with Windows-only software. Customers who want a mix of passthrough, time-based vGPU, and MIG-backed vGPU will prefer competitors who offer the full range of options. And the certification of NVIDIA solutions like NVIDIA Omniverse will be blocked, despite a great potential to increase OpenShift consumption, as it uses RTX/A40 GPUs for virtual workstations (not yet certified by NVIDIA on OpenShift Virtualization) and A100/H100 for physics simulation, both use cases probably leveraging vGPUs [7]. There are many necessary conditions for that to happen, and MIG-backed vGPU support is one of them.
Who | What | Reference |
---|---|---|
DEV | Upstream roadmap issue (or individual upstream PRs) | <link to GitHub Issue> |
DEV | Upstream documentation merged | <link to meaningful PR> |
DEV | gap doc updated | <name sheet and cell> |
DEV | Upgrade consideration | <link to upgrade-related test or design doc> |
DEV | CEE/PX summary presentation | label epic with cee-training and add a <link to your support-facing preso> |
QE | Test plans in Polarion | <link or reference to Polarion> |
QE | Automated tests merged | <link or reference to automated tests> |
DOC | Downstream documentation merged | <link to meaningful PR> |
Part of making https://kubernetes.io/docs/concepts/scheduling-eviction/dynamic-resource-allocation available for early adoption.
As a stakeholder aiming to adopt KubeSaw as a Namespace-as-a-Service solution, I want the project to provide streamlined tooling and a clear code-base, ensuring seamless adoption and integration into my clusters.
Efficient adoption of KubeSaw, especially as a Namespace-as-a-Service solution, relies on intuitive tooling and a transparent codebase. Improving these aspects will empower stakeholders to effortlessly integrate KubeSaw into their Kubernetes clusters, ensuring a smooth transition to enhanced namespace management.
As a Stakeholder, I want a streamlined setup of the KubeSaw project and a fully automated way of upgrading this setup along with updates to the installation.
The expected outcome within the market is both growth and retention. The improved tooling and codebase will attract new stakeholders (growth) and enhance the experience for existing users (retention) by providing a straightforward path to adopting KubeSaw's Namespace-as-a-Service features in their clusters.
This epic is to track all the unplanned work related to security incidents, fixing flaky e2e tests, and other urgent and unplanned efforts that may arise during the sprint.
Goal:
As an administrator, I would like to deploy OpenShift 4 clusters to the AWS C2S region
Problem:
Customers were able to deploy to AWS C2S region in OCP 3.11, but our global configuration in OCP 4.1 doesn't support this.
Why is this important:
Lifecycle Information:
Previous Work:
Here are the relevant PRs from OCP 3.11. You can see that these endpoints are not part of the standard SDK (they use an entirely separate SDK). To support these regions the endpoints had to be configured explicitly.
Seth Jennings has put together a highly customized POC.
Dependencies:
Prioritized epics + deliverables (in scope / not in scope):
Related : https://jira.coreos.com/browse/CORS-1271
Estimate (XS, S, M, L, XL, XXL): L
Customers: North America Public Sector and Government Agencies
Open Questions:
Plugin teams need a mechanism to extend the OCP console that is decoupled enough so they can deliver at the cadence of their projects and not be forced into the OCP Console release timelines.
The OCP Console Dynamic Plugin Framework will enable all our plugin teams to do the following:
Requirement | Notes | isMvp? |
---|---|---|
UI to enable and disable plugins | YES | |
Dynamic Plugin Framework in place | YES | |
Testing Infra up and running | YES | |
Docs and read me for creating and testing Plugins | YES | |
CI - MUST be running successfully with test automation | This is a requirement for ALL features. | YES |
Release Technical Enablement | Provide necessary release enablement details and documents. | YES |
Documentation Considerations
Questions to be addressed:
Console static plugins, maintained as part of the frontend monorepo, currently use static extension types (packages/console-plugin-sdk/src/typings) which directly reference various kinds of objects, including React components and arbitrary functions.
To ease the long-term transition from static to dynamic plugins, we should support a use case where an existing static plugin goes through the following stages:
Once a static plugin reaches the "use dynamic extensions only" stage, its maintainers can move it out of the Console monorepo - the plugin becomes dynamic, shipped via the corresponding operator and loaded by Console app at runtime.
Dynamic plugins will need changes to the console operator config to be enabled and disabled. We'll need either a new CRD or an annotation on CSVs for console to discover when plugins are available.
This story tracks any API updates needed to openshift/api and any operator updates needed to wire through dynamic plugins as console config.
See https://github.com/openshift/enhancements/pull/441 for design details.
Requirement | Notes | isMvp? |
---|
CI - MUST be running successfully with test automation | This is a requirement for ALL features. | YES |
Release Technical Enablement | Provide necessary release enablement details and documents. | YES |
This Section:
This Section: What does the person writing code, testing, documenting need to know? What context can be provided to frame this feature.
Questions to be addressed:
When OCP is performing a cluster upgrade, users should be notified about this fact.
There are two possibilities for surfacing the cluster upgrade to users:
AC:
Note: We need to decide if we want to distinguish this particular notification with a different color. CCing Ali Mobrem.
Created from: https://issues.redhat.com/browse/RFE-3024
The work on this story is dependent on following changes:
The console already supports custom routes on the operator config. The newly proposed CustomDomains API introduces a unified way to customize stock-installed routes, covering both their names and their serving certs/keys. From the console perspective those are:
The setup should be done on the Ingress config, where two new fields are introduced:
Console-operator will only consume the API and check for changes. If a custom domain is set for either the `console` or `downloads` route in the `openshift-console` namespace, console-operator will read the setup and create a custom route accordingly. When a custom route is set up for any of console's routes, the default route won't be deleted; instead it will be updated so it redirects to the custom one. This is done for two reasons:
Console-operator will still need to support the CustomDomain API that is available on its config.
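A sketch of the kind of Ingress-config setup described above; the componentRoutes field shape is an assumption based on the proposal's direction, not necessarily its exact fields:

```yaml
apiVersion: config.openshift.io/v1
kind: Ingress
metadata:
  name: cluster
spec:
  componentRoutes:
    - name: console
      namespace: openshift-console
      hostname: console.custom.example.com
      # Secret with the serving cert/key for the custom domain:
      servingCertKeyPairSecret:
        name: console-custom-tls
```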
Acceptance criteria:
Questions:
Bump the openshift/api go dependency to pick up the new CustomDomain API for the Ingress config.
Implement console-operator changes to consume new CustomDomains API, based on the story details.
During master node upgrades, when nodes are getting drained, there's currently no protection from two or more operands going down. If your component is required to be available during upgrades or other voluntary disruptions, please consider deploying a PDB (PodDisruptionBudget) to protect your operands.
The effort is tracked in https://issues.redhat.com/browse/WRKLDS-293.
Example:
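A minimal PDB sketch for the console deployment; the namespace and label selector are assumptions:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: console
  namespace: openshift-console
spec:
  # Keep all but one console pod available during drains:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: console
```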
Acceptance Criteria:
1. Create PDB controller in console-operator for both console and downloads pods
2. Add e2e tests for PDB in single node and multi node cluster
Note: We should consider to backport this to 4.10
Quick Starts are a key tool for helping our customers discover and understand how to take advantage of services that run on top of the OpenShift Platform. This feature will focus on making Quick Starts extensible. With this console extension, our customers and partners will be able to add their own Quick Starts to help drive a great user experience.
Enhancement PR: https://github.com/openshift/enhancements/pull/360
Requirement | Notes | isMvp? |
---|---|---|
Define QuickStart CRD | YES | |
Console Operator: Out of the box support for installing Quick Starts for enabling Operators | YES | |
Process( Design\Review), Documentation + Template for providing out of the box quick starts | YES | |
Process, Docs, for enabling Operators to add Quick Starts | YES | |
Migrate existing UI to work with CRD | ||
Move Existing Quick Starts to new CRD | YES | |
Support Internationalization | NO | |
CI - MUST be running successfully with test automation | This is a requirement for ALL features. | YES |
Release Technical Enablement | Provide necessary release enablement details and documents. | YES |
Questions to be addressed:
Goal
Provide a dynamic extensible mechanism to add Guided Tours to the OCP Console.
User Stories/Scenarios
As a product owner, I need a mechanism to add guided tours that will help guide my users to enable my service.
As a product owner, I need a mechanism to add guided tours that will help guide my users to consume my service.
As an Operator, I need a mechanism to add guided tours that will help guide my users to consume my service.
Acceptance Criteria
Dependencies (internal and external)
Previous Work (Optional):
Open questions:
Done Checklist
As a user, I would like to see a YAML sample for the new QuickStart CRD when I go to create my own QuickStarts.
We need a README about the submission process for adding quick starts to the console-operator repo.
Enable the OCP Console to send back user analytics to our existing endpoints in console.redhat.com. Please refer to doc for details of what we want to capture in the future:
Collect desired telemetry of user actions within OpenShift console to improve knowledge of user behavior.
OpenShift console should be able to send telemetry to a pre-configured Red Hat proxy, from which it can be forwarded to 3rd-party services for analysis.
User analytics should respect the existing telemetry mechanism used to disable data being sent back
Need to update existing documentation with what user data we track from the OCP Console: https://docs.openshift.com/container-platform/4.14/support/remote_health_monitoring/about-remote-health-monitoring.html
Capture and send desired user analytics from OpenShift console to Red Hat proxy
Red Hat proxy to forward telemetry events to appropriate Segment workspace and Amplitude destination
Use existing setting to opt out of sending telemetry: https://docs.openshift.com/container-platform/4.14/support/remote_health_monitoring/opting-out-of-remote-health-reporting.html#opting-out-remote-health-reporting
Also, allow just disabling user analytics without affecting the rest of telemetry: add an annotation to the Console to disable just user analytics.
Update docs to show this method as well.
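For illustration, disabling only user analytics might look like this, using the annotation key referenced later in this document (the expected value is an assumption):

```sh
oc annotate console.operator.openshift.io cluster telemetry.console.openshift.io/DISABLED=true
```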
We will require a mechanism to store all the segment values
We need to be able to pass back orgID that we receive from the OCM subscription API call
Sending telemetry from OpenShift cluster nodes
Console already has support for sending analytics to segment.io in Dev Sandbox and OSD environments. We should reuse this existing capability, but default to http://console.redhat.com/connections/api for analytics and http://console.redhat.com/connections/cdn to load the JavaScript in other environments. We must continue to allow Dev Sandbox and OSD clusters a way to configure their own segment key, whether telemetry is enabled, segment API host, and other options currently set as annotations on the console operator configuration resource.
Console will need a way to determine the org-id to send with telemetry events. Likely the console operator will need to read this from the cluster pull secret.
Which other projects, including ROSA/OSD/ARO, and versions in our portfolio does this feature impact? What interoperability test scenarios should be factored by the layered products? Initial completion during Refinement status.
The console telemetry plugin needs to send data to a new Red Hat ingress point that will then forward it to Segment for analysis.
Goal:
Update console telemetry plugin to send data to the appropriate ingress point.
Ingress point created for console.redhat.com
As an administrator, I want to disable all telemetry on my cluster including UI analytics sent to segment.
We should honor the existing telemetry configuration so that we send no analytics when an admin opts out of telemetry. See the documentation here:
Yes, this is the officially supported way to disable telemetry, though we also have a hidden flag in the CMO configmap that CI clusters use to disable telemetry (it depends on whether you want to push analytics for CI clusters).
The CMO configmap is set to:
data:
  config.yaml: |-
    telemeterClient:
      enabled: false
the CMO code that reads the cloud.openshift.com token:
https://github.com/openshift/cluster-monitoring-operator/blob/b7e3f50875f2bb1fed912b23fb80a101d3a786c0/pkg/manifests/config.go#L358-L386
Slack discussion https://redhat-internal.slack.com/archives/C0VMT03S5/p1707753976034809
# [1] Check cluster pull secret for cloud.openshift.com creds
oc get secret pull-secret -n openshift-config -o json | jq -r '.data.".dockerconfigjson"' | base64 -d | jq -r '.auths."cloud.openshift.com"'
# [2] Check cluster monitoring operator config for 'telemeterClient.enabled == false'
oc get configmap cluster-monitoring-config -n openshift-monitoring -o json | jq -r '.data."config.yaml"'
# [3] Check console operator config telemetry disabled annotation
oc get console.operator.openshift.io cluster -o json | jq -r '.metadata.annotations."telemetry.console.openshift.io/DISABLED"'
We want to enable segment analytics by default on all (incl. self-managed) OCP clusters using a known segment API key and the console.redhat.com proxy. We'll still want to allow to honor the segment-related annotations on the console operator config for overriding these values.
Most likely the segment key should be defaulted in the console operator, otherwise we would need a separate console flag for disabling analytics. If the operator provides the key, then the console backend can use the presence of the key to determine when to enable analytics. We can likely change the segment URL and CDN default values directly in the console code, however.
ODC-7517 tracks disabling segment analytics when cluster telemetry is disabled, which is a separate change, but required for this work.
OpenShift UI Telemetry Implementation details
These three keys should have new default values:
Defaults:
stringData:
  SEGMENT_API_KEY: BnuS1RP39EmLQjP21ko67oDjhbl9zpNU
  SEGMENT_API_HOST: console.redhat.com/connections/api/v1
  SEGMENT_JS_HOST: console.redhat.com/connections/cdn
AC:
API the Console uses:
const apiUrl = `/api/accounts_mgmt/v1/subscriptions?page=1&search=external_cluster_id%3D%27${clusterID}%27`;
Reference: Original Console PR
High Level Feature Details can be found here
Currently, we can get the organization ID from the OCM server by querying the subscription and adding the fetchOrganization=true query parameter, based on the comment.
We should be passing this ID as SERVER_FLAG.telemetry.ORGANIZATION_ID to the frontend, and as organizationId to Segment.io
Fetching should be done by the console-operator due to its RBAC permissions. Once the organization ID is retrieved, the console operator should set it in console-config.yaml, together with the other telemetry variables.
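For illustration, the fetch might look like the following; the endpoint shape matches the console API call above, while the OCM base URL and the jq path are assumptions:

```sh
curl -s -H "Authorization: Bearer ${OCM_TOKEN}" \
  "https://api.openshift.com/api/accounts_mgmt/v1/subscriptions?search=external_cluster_id%3D%27${CLUSTER_ID}%27&fetchOrganization=true" \
  | jq -r '.items[0].organization.id'
```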
AC:
Volume Group Snapshots is a key new Kubernetes storage feature that allows multiple PVs to be grouped together and snapshotted at the same time. This enables customers to take consistent snapshots of applications that span multiple PVs.
This is also a key requirement for backup and DR solutions.
https://kubernetes.io/blog/2023/05/08/kubernetes-1-27-volume-group-snapshot-alpha/
https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/3476-volume-group-snapshot
Productize the volume group snapshots feature as Tech Preview: provide docs and testing, as well as a feature gate to enable it, so that customers and partners can test it in advance.
The feature should be graduated to beta upstream to become Tech Preview in OCP. Tests and CI must pass, and a feature gate should allow customers and partners to easily enable it. We should identify all OCP-shipped CSI drivers that support this feature and configure them accordingly.
CSI drivers development/support of this feature.
Provide any additional context needed to frame the feature. Initial completion during Refinement status.
Drivers must support this feature and enable it. Partners may need to change their operator and/or doc to support it.
Document how to enable the feature, what this feature does and how to use it. Update the OCP driver's table to include this capability.
Can be leveraged by ODF and OCP virt, especially around backup and DR scenarios.
Epic Goal*
Create an OCP feature gate that allows customers and partners to use the VolumeGroupSnapshot feature while it is in alpha and beta upstream.
Why is this important? (mandatory)
Volume group snapshot is an important feature for ODF, OCP virt and backup partners. It requires driver support so partners need early access to the feature to confirm their driver works as expected before GA. The same applies to backup partners.
Scenarios (mandatory)
Provide details for user scenarios including actions to be performed, platform specifications, and user personas.
Dependencies (internal and external) (mandatory)
This depends on driver support; the feature gate will enable the feature in the drivers that support it (OCP-shipped drivers).
The feature gate should
Contributing Teams(and contacts) (mandatory)
Our expectation is that teams would modify the list below to fit the epic. Some epics may not need all the default groups but what is included here should accurately reflect who will be involved in delivering the epic.
Acceptance Criteria (optional)
By enabling the feature gate, partners should be able to use the VolumeGroupSnapshot API. Non-OCP-shipped drivers may need additional configuration.
Drawbacks or Risk (optional)
Reasons we should consider NOT doing this such as: limited audience for the feature, feature will be superseded by other work that is planned, resulting feature will introduce substantial administrative complexity or user confusion, etc.
Done - Checklist (mandatory)
The following points apply to all epics and are what the OpenShift team believes are the minimum set of criteria that epics should meet for us to consider them potentially shippable. We request that epic owners modify this list to reflect the work to be completed in order to produce something that is potentially shippable.
Other teams:
If needed this card can be broken down into more cards with sublists, each card assigned to a different assignee.
Epic Goal*
Drive the technical part of the Kubernetes 1.29 upgrade, including rebasing the openshift/kubernetes repository and coordinating across the OpenShift organization to get e2e tests green for the OCP release.
Why is this important? (mandatory)
OpenShift 4.16 cannot be released without Kubernetes 1.29
Scenarios (mandatory)
Dependencies (internal and external) (mandatory)
What items must be delivered by other teams/groups to enable delivery of this epic.
Contributing Teams(and contacts) (mandatory)
Our expectation is that teams would modify the list below to fit the epic. Some epics may not need all the default groups but what is included here should accurately reflect who will be involved in delivering the epic.
Acceptance Criteria (optional)
Provide some (testable) examples of how we will know if we have achieved the epic goal.
Drawbacks or Risk (optional)
Reasons we should consider NOT doing this such as: limited audience for the feature, feature will be superseded by other work that is planned, resulting feature will introduce substantial administrative complexity or user confusion, etc.
Done - Checklist (mandatory)
The following points apply to all epics and are what the OpenShift team believes are the minimum set of criteria that epics should meet for us to consider them potentially shippable. We request that epic owners modify this list to reflect the work to be completed in order to produce something that is potentially shippable.
PRs:
Note: There is no work pending from OTA team. The Jira tracks the work pending from other teams.
We started the feature with the assumption that CVO has to implement sigstore key verification like we do with gpg keys.
After investigation we found that sigstore key verification is done at the node level and there is no CVO work. From that point this feature became a tracking feature for us to help other teams do the "sigstore key verification" tasks, specifically the Node team. The "sigstore key verification" roadmap is here: https://docs.google.com/presentation/d/16dDwALKxT4IJm7kbEU4ALlQ4GBJi14OXDNP6_O2F-No/edit#slide=id.g547716335e_0_2075
Add sigstore signatures to core OCP payload and enable verification. Verification is now done via CRIO.
There is no CVO work in this feature and this is a Tech Preview change.
OpenShift Release Engineering can leverage a mature signing and signature verification stack instead of relying on simple signing
Customers can leverage OpenShift to create trust relationships for running OCP core container images
Specifically, customers can trust signed images from a Red Hat registry and OCP can verify those signatures
A list of specific needs or objectives that a feature must deliver in order to be considered complete. Be sure to include nonfunctional requirements such as security, reliability, performance, maintainability, scalability, usability, etc. Initial completion during Refinement status.
<enter general Feature acceptance here>
– Kubelet/CRIO to verify RH images & release payload sigstore signatures
– ART will add sigstore signatures to core OCP images
Anyone reviewing this Feature needs to know which deployment configurations that the Feature will apply to (or not) once it's been completed. Describe specific needs (or indicate N/A) for each of the following deployment scenarios. For specific configurations that are out-of-scope for a given release, ensure you provide the OCPSTRAT (for the future to be supported configuration) as well.
These acceptance criteria are for all deployment flavors of OpenShift.
Deployment considerations | List applicable specific needs (N/A = not applicable) | |
Self-managed, managed, or both | both | |
Classic (standalone cluster) | yes | |
Hosted control planes | yes | |
Multi node, Compact (three node), or Single node (SNO), or all | ||
Connected / Restricted Network | ||
Architectures, e.g. x86_x64, ARM (aarch64), IBM Power (ppc64le), and IBM Z (s390x) | ||
Operator compatibility | ||
Backport needed (list applicable versions) | Not Applicable | |
UI need (e.g. OpenShift Console, dynamic plugin, OCM) | none |
Other (please specify) |
Add documentation for sigstore verification and gpg verification
For folks mirroring release images (e.g. disconnected/restricted-network):
OCP clusters need to add the ability to validate Sigstore signatures for OpenShift release images.
This is part of Red Hat's overall Sigstore strategy.
Today, Red Hat uses "simple signing" which uses an OpenPGP/GPG key and a separate file server to host signatures for container images.
Cosign is on track to be an industry standard container signing technique. The main difference is that, instead of signatures being stored in a separate file server, the signature is stored in the same registry that hosts the image.
Design document / discussion from software production: https://docs.google.com/document/d/1EPCHL0cLFunBYBzjBPcaYd-zuox1ftXM04aO6dZJvIE/edit
Demo video: https://drive.google.com/file/d/1bpccVLcVg5YgoWnolQxPu8gXSxoNpUuQ/view
Software production will be migrating to the cosign over the course of 2024.
ART will continue to sign using simple signing in combination with sigstore signatures until SP stops using it and product documentation exists to help customers migrate from the simple signing signature verification.
Currently this epic is primarily supporting the Node implementation work in OCPNODE-2231. There's a minor CVO UX tweak planned in OTA-1307 that's definitely OTA work. There's also the enhancement proposal in OTA-1294 and the cluster-update-keys work in OTA-1304, which Trevor happens to be doing for inertial reasons, but which he's happy to hand off to OCPNODE and/or shift under OCPNODE-2231.
As described in the OTA-1294 enhancement, the cluster-update-keys repository isn't actually managed by the OTA team, but I expect it will be me opening the pull, and there isn't a dedicated Jira project covering cluster-update-keys, so I'm creating this ticket under the OTA Epic just because I can't think of a better place to put it.
Seeing various MCO issues across different platforms and repos on TechPreview.
@David Joshy found that it might be https://github.com/openshift/cluster-update-keys/pull/58
@jerzhang found diffs in sigstore-registries and policy.json, which may be coming from this manifest. Is this available during bootstrap?
Goal:
Update team owned repositories to Kubernetes v1.31
?? is the 1.31 freeze
?? is the 1.31 GA
Problem: <please update links for 1.31>
The following repository must be rebased onto the latest version of Kubernetes:
The following repositories should be rebased onto the latest version of Kubernetes:
Entirely remove dependencies on the k/k repository inside oc.
Why is this important:
Epic Goal*
Drive the technical part of the Kubernetes 1.31 upgrade, including rebasing the openshift/kubernetes repository and coordinating across the OpenShift organization to get e2e tests green for the OCP release.
Why is this important? (mandatory)
OpenShift 4.18 cannot be released without Kubernetes 1.31
Scenarios (mandatory)
Dependencies (internal and external) (mandatory)
What items must be delivered by other teams/groups to enable delivery of this epic.
Contributing Teams(and contacts) (mandatory)
Our expectation is that teams would modify the list below to fit the epic. Some epics may not need all the default groups but what is included here should accurately reflect who will be involved in delivering the epic.
Acceptance Criteria (optional)
Provide some (testable) examples of how we will know if we have achieved the epic goal.
Drawbacks or Risk (optional)
Reasons we should consider NOT doing this such as: limited audience for the feature, feature will be superseded by other work that is planned, resulting feature will introduce substantial administrative complexity or user confusion, etc.
Done - Checklist (mandatory)
The following points apply to all epics and are what the OpenShift team believes are the minimum set of criteria that epics should meet for us to consider them potentially shippable. We request that epic owners modify this list to reflect the work to be completed in order to produce something that is potentially shippable.
PRs:
Retro: Kube 1.31 Rebase Retrospective Timeline (OCP 4.18)
Retro recording: https://drive.google.com/file/d/1htU-AglTJjd-VgFfwE3z_dH5tKXT1Tes/view?usp=drive_web
Address performance and scale issues in Whereabouts IPAM CNI
Whereabouts is becoming increasingly popular for use on workloads that operate at scale. Whereabouts was originally built as a convenience function for a handful of IPs; however, more and more customers want to use Whereabouts in scale situations.
Notably, for telco and AI/ML scenarios: some AI/ML scenarios launch a large number of pods that need to use secondary networks for related traffic.
Upstream collaboration outline
Console enhancements based on customer RFEs that improve customer user experience.
Requirement | Notes | isMvp? |
---|
CI - MUST be running successfully with test automation | This is a requirement for ALL features. | YES |
Release Technical Enablement | Provide necessary release enablement details and documents. | YES |
This Section:
This Section: What does the person writing code, testing, documenting need to know? What context can be provided to frame this feature.
Questions to be addressed:
As a cluster admin, I want a cluster-wide setting for hiding the "Getting started resources" banner from the Overview, for all console users.
AC:
An elevator pitch (value statement) that describes the Feature in a clear, concise way. Complete during New status.
<your text here>
The observable functionality that the user now has as a result of receiving this feature. Include the anticipated primary user type/persona and which existing features, if any, will be expanded. Complete during New status.
<your text here>
A list of specific needs or objectives that a feature must deliver in order to be considered complete. Be sure to include nonfunctional requirements such as security, reliability, performance, maintainability, scalability, usability, etc. Initial completion during Refinement status.
<enter general Feature acceptance here>
Anyone reviewing this Feature needs to know which deployment configurations that the Feature will apply to (or not) once it's been completed. Describe specific needs (or indicate N/A) for each of the following deployment scenarios. For specific configurations that are out-of-scope for a given release, ensure you provide the OCPSTRAT (for the future to be supported configuration) as well.
Deployment considerations | List applicable specific needs (N/A = not applicable) |
Self-managed, managed, or both | |
Classic (standalone cluster) | |
Hosted control planes | |
Multi node, Compact (three node), or Single node (SNO), or all | |
Connected / Restricted Network | |
Architectures, e.g. x86_x64, ARM (aarch64), IBM Power (ppc64le), and IBM Z (s390x) | |
Operator compatibility | |
Backport needed (list applicable versions) | |
UI need (e.g. OpenShift Console, dynamic plugin, OCM) | |
Other (please specify) |
Include use case diagrams, main success scenarios, alternative flow scenarios. Initial completion during Refinement status.
<your text here>
Include a list of refinement / architectural questions that may need to be answered before coding can begin. Initial completion during Refinement status.
<your text here>
High-level list of items that are out of scope. Initial completion during Refinement status.
<your text here>
Provide any additional context needed to frame the feature. Initial completion during Refinement status.
<your text here>
Provide any additional customer-specific considerations that must be made when designing and delivering the Feature. Initial completion during Refinement status.
<your text here>
Provide information that needs to be considered and planned so that documentation will meet customer needs. If the feature extends existing functionality, provide a link to its current documentation. Initial completion during Refinement status.
<your text here>
Which other projects, including ROSA/OSD/ARO, and versions in our portfolio does this feature impact? What interoperability test scenarios should be factored by the layered products? Initial completion during Refinement status.
<your text here>
link back to OCPSTRAT-1644 somehow
Epic Goal*
What is our purpose in implementing this? What new capability will be available to customers?
Why is this important? (mandatory)
What are the benefits to the customer or Red Hat? Does it improve security, performance, supportability, etc? Why is work a priority?
Scenarios (mandatory)
Provide details for user scenarios including actions to be performed, platform specifications, and user personas.
Dependencies (internal and external) (mandatory)
What items must be delivered by other teams/groups to enable delivery of this epic.
Contributing Teams(and contacts) (mandatory)
Our expectation is that teams would modify the list below to fit the epic. Some epics may not need all the default groups but what is included here should accurately reflect who will be involved in delivering the epic.
Acceptance Criteria (optional)
Provide some (testable) examples of how we will know if we have achieved the epic goal.
Drawbacks or Risk (optional)
Reasons we should consider NOT doing this such as: limited audience for the feature, feature will be superseded by other work that is planned, resulting feature will introduce substantial administrative complexity or user confusion, etc.
Done - Checklist (mandatory)
The following points apply to all epics and are what the OpenShift team believes are the minimum set of criteria that epics should meet for us to consider them potentially shippable. We request that epic owners modify this list to reflect the work to be completed in order to produce something that is potentially shippable.
The ability in OpenShift to create trust and directly consume access tokens issued by external OIDC Authentication Providers using an authentication approach similar to upstream Kubernetes.
BYO Identity will help facilitate CLI-only workflows and capabilities of the Authentication Provider (such as Keycloak, Dex, Azure AD), similar to upstream Kubernetes.
Ability in OpenShift to provide a direct, pluggable authentication workflow such that the OpenShift/K8s API server can consume access tokens issued by external OIDC identity providers. Kubernetes provides this integration as described here. Customers/users can then configure their IdPs to support the OIDC protocols and workflows they desire, such as the client credentials flow.
OpenShift OAuth server is still available as default option, with the ability to tune in the external OIDC provider as a Day-2 configuration.
Include a list of refinement / architectural questions that may need to be answered before coding can begin. Initial completion during Refinement status.
High-level list of items that are out of scope. Initial completion during Refinement status.
Provide any additional context needed to frame the feature. Initial completion during Refinement status.
Provide any additional customer-specific considerations that must be made when designing and delivering the Feature. Initial completion during Refinement status.
Provide information that needs to be considered and planned so that documentation will meet customer needs. Initial completion during Refinement status.
Which other projects and versions in our portfolio does this feature impact? What interoperability test scenarios should be factored by the layered products? Initial completion during Refinement status.
Epic Goal
The ability to provide a direct authentication workflow such that OpenShift can consume bearer tokens issued by external OIDC identity providers, replacing the built-in OAuth stack by deactivating/removing its components as necessary.
Why is this important? (mandatory)
OpenShift has its own built-in OAuth server which can be used to obtain OAuth access tokens for authentication to the API. The server can be configured with an external identity provider (including support for OIDC), however it is still the built-in server that issues tokens, and thus authentication is limited to the capabilities of the oauth-server.
Scenarios (mandatory)
Dependencies (internal and external) (mandatory)
Contributing Teams(and contacts) (mandatory)
Acceptance Criteria (optional)
Drawbacks or Risk (optional)
Done - Checklist (mandatory)
Image and artifact signing is a key part of a DevSecOps model. The Red Hat-sponsored sigstore project aims to simplify signing of cloud-native artifacts and sees increasing interest and uptake in the Kubernetes community. This document proposes to incrementally invest in OpenShift support for sigstore-style signed images and be public about it. The goal is to give customers a practical and scalable way to establish content trust. It will strengthen OpenShift’s security philosophy and value-add in the light of the recent supply chain security crisis.
CRIO
https://docs.google.com/document/d/12ttMgYdM6A7-IAPTza59-y2ryVG-UUHt-LYvLw4Xmq8/edit#
This feature will track upstream work from the OpenShift Control Plane teams - API, Auth, etcd, Workloads, and Storage.
To continue and develop meaningful contributions to the upstream community including feature delivery, bug fixes, and leadership contributions.
Note: The matchLabelKeys field is a beta-level field and enabled by default in 1.27. You can disable it by disabling the MatchLabelKeysInPodTopologySpread [feature gate](https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/).
Removing it from Tech Preview as the feature is enabled by default.
Just cleanup work. An example of the field follows below.
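For reference, a minimal example of matchLabelKeys in a topology spread constraint; labels and image are placeholders (pod-template-hash is normally set by the Deployment controller):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example
  labels:
    app: web
    pod-template-hash: abc123
spec:
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: kubernetes.io/hostname
      whenUnsatisfiable: DoNotSchedule
      labelSelector:
        matchLabels:
          app: web
      # Pods are grouped by these label values in addition to the selector:
      matchLabelKeys:
        - pod-template-hash
  containers:
    - name: web
      image: registry.example.com/web:latest
```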
Upstream K8s deprecated PodSecurityPolicy and replaced it with a new built-in admission controller that enforces the Pod Security Standards (see here for the motivations for deprecation). There is an OpenShift-specific dedicated pod admission system called Security Context Constraints. Our aim is to keep the Security Context Constraints pod admission system while also allowing users to have access to Kubernetes Pod Security Admission.
With OpenShift 4.11, we turned on Pod Security Admission with global "privileged" enforcement. Additionally, we set the "restricted" profile for warnings and audit. This configuration made it possible for users to opt their namespaces in to Pod Security Admission with the per-namespace labels. We also introduced a new mechanism that automatically synchronizes the Pod Security Admission "warn" and "audit" labels.
With OpenShift 4.15, we intend to change the global configuration to enforce the "restricted" pod security profile. With this change, the label synchronization mechanism will also switch to synchronizing the "enforce" Pod Security Admission label rather than "audit" and "warn".
Epic Goal
Get Pod Security admission to be run in "restricted" mode globally by default alongside with SCC admission.
When creating a custom SCC, it is possible to assign a priority that is higher than existing SCCs. This means that any SA with access to all SCCs might use the higher priority custom SCC, and this might mutate a workload in an unexpected/unintended way.
To protect platform workloads from such an effect (which, combined with PSa, might result in rejecting the workload once we start enforcing the "restricted" profile) we must pin the required SCC to all workloads in platform namespaces (openshift-, kube-, default).
Each workload should pin the least-privileged SCC, except workloads in runlevel 0 namespaces, which should pin the "privileged" SCC (SCC admission is not enabled in these namespaces, but we should pin an SCC for tracking purposes).
The following tables track progress.
 | 4.18 | 4.17 | 4.16 | 4.15 |
---|---|---|---|---|
monitored | 82 | 82 | 82 | 82 |
fix needed | 69 | 69 | 69 | 69 |
fixed | 38 | 35 | 31 | 39 |
remaining | 31 | 34 | 38 | 30 |
~ remaining non-runlevel | 9 | 12 | 16 | 8 |
~ remaining runlevel (low-prio) | 22 | 22 | 22 | 22 |
~ untested | 2 | 2 | 2 | 82 |
# | namespace | 4.18 | 4.17 | 4.16 | 4.15 |
---|---|---|---|---|---|
1 | oc debug node pods | #1763 | #1816 | #1818 | |
2 | openshift-apiserver-operator | #573 | #581 | ||
3 | openshift-authentication | #656 | #675 | ||
4 | openshift-authentication-operator | #656 | #675 | ||
5 | openshift-catalogd | #50 | #58 | ||
6 | openshift-cloud-credential-operator | #681 | #736 | ||
7 | openshift-cloud-network-config-controller | #2282 | #2490 | #2496 | |
8 | openshift-cluster-csi-drivers | #524 #131 #6 #127 #108 #118 #306 #265 #75 | #170 #459 | #484 | |
9 | openshift-cluster-node-tuning-operator | #968 | #1117 | ||
10 | openshift-cluster-olm-operator | #54 | n/a | ||
11 | openshift-cluster-samples-operator | #535 | #548 | ||
12 | openshift-cluster-storage-operator | #516 | #459 #196 | #484 #211 | |
13 | openshift-cluster-version | #1038 | #1068 | ||
14 | openshift-config-operator | #410 | #420 | ||
15 | openshift-console | #871 | #908 | #924 | |
16 | openshift-console-operator | #871 | #908 | #924 | |
17 | openshift-controller-manager | #336 | #361 | ||
18 | openshift-controller-manager-operator | #336 | #361 | ||
19 | openshift-e2e-loki | #56579 | #56579 | #56579 | #56579 |
20 | openshift-image-registry | #1008 | #1067 | ||
21 | openshift-ingress | #1031 | |||
22 | openshift-ingress-canary | #1031 | |||
23 | openshift-ingress-operator | #1031 | |||
24 | openshift-insights | #1041 | #915 | #967 | |
25 | openshift-kni-infra | #4504 | #4542 | #4539 | #4540 |
26 | openshift-kube-storage-version-migrator | #107 | #112 | ||
27 | openshift-kube-storage-version-migrator-operator | #107 | #112 | ||
28 | openshift-machine-api | #1308 | #407 | #315 #282 #1220 #73 #50 #433 | #332 #326 #1288 #81 #57 #443 |
29 | openshift-machine-config-operator | #4636 | #4219 | #4384 | #4393 |
30 | openshift-manila-csi-driver | #234 | #235 | #236 | |
31 | openshift-marketplace | #578 | #561 | #570 | |
32 | openshift-metallb-system | #238 | #240 | #241 | |
33 | openshift-monitoring | #2498 | #2335 | #2420 | |
34 | openshift-network-console | #2545 | |||
35 | openshift-network-diagnostics | #2282 | #2490 | #2496 | |
36 | openshift-network-node-identity | #2282 | #2490 | #2496 | |
37 | openshift-nutanix-infra | #4504 | #4504 | #4539 | #4540 |
38 | openshift-oauth-apiserver | #656 | #675 | ||
39 | openshift-openstack-infra | #4504 | #4504 | #4539 | #4540 |
40 | openshift-operator-controller | #100 | #120 | ||
41 | openshift-operator-lifecycle-manager | #703 | #828 | ||
42 | openshift-route-controller-manager | #336 | #361 | ||
43 | openshift-service-ca | #235 | #243 | ||
44 | openshift-service-ca-operator | #235 | #243 | ||
45 | openshift-sriov-network-operator | #754 #995 | #999 | #1003 | |
46 | openshift-user-workload-monitoring | #2335 | #2420 | ||
47 | openshift-vsphere-infra | #4504 | #4542 | #4539 | #4540 |
48 | (runlevel) default | ||||
49 | (runlevel) kube-system | ||||
50 | (runlevel) openshift-cloud-controller-manager | ||||
51 | (runlevel) openshift-cloud-controller-manager-operator | ||||
52 | (runlevel) openshift-cluster-api | ||||
53 | (runlevel) openshift-cluster-machine-approver | ||||
54 | (runlevel) openshift-dns | ||||
55 | (runlevel) openshift-dns-operator | ||||
56 | (runlevel) openshift-etcd | ||||
57 | (runlevel) openshift-etcd-operator | ||||
58 | (runlevel) openshift-kube-apiserver | ||||
59 | (runlevel) openshift-kube-apiserver-operator | ||||
60 | (runlevel) openshift-kube-controller-manager | ||||
61 | (runlevel) openshift-kube-controller-manager-operator | ||||
62 | (runlevel) openshift-kube-proxy | ||||
63 | (runlevel) openshift-kube-scheduler | ||||
64 | (runlevel) openshift-kube-scheduler-operator | ||||
65 | (runlevel) openshift-multus | ||||
66 | (runlevel) openshift-network-operator | ||||
67 | (runlevel) openshift-ovn-kubernetes | ||||
68 | (runlevel) openshift-sdn | ||||
69 | (runlevel) openshift-storage | ||||
Implement Migration core for MAPI to CAPI for AWS
When customers use CAPI, there must be no negative effect from switching over to CAPI: migration of Machine resources must be seamless, and the fields in MAPI/CAPI should reconcile from both CRDs.
OCP/Telco Definition of Done
Epic Template descriptions and documentation.
When the Machine and MachineSet MAPI resources are non-authoritative, the Machine and MachineSet controllers should observe this condition and exit, pausing reconciliation.
When they pause, they should acknowledge this pause by adding a paused condition to the status and ensuring it is set to true.
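A hypothetical sketch of the acknowledged pause on a Machine's status; the condition type, reason, and message below are assumptions, not the finalized API:

```yaml
apiVersion: machine.openshift.io/v1beta1
kind: Machine
metadata:
  name: example-machine-0                  # illustrative name
status:
  conditions:
  - type: Paused                           # assumed condition type
    status: "True"
    reason: AuthoritativeAPINotMachineAPI  # assumed reason
    message: MAPI resource is non-authoritative; reconciliation is paused.
```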
Support deploying an OpenShift cluster across multiple vSphere clusters, i.e. configuring multiple vCenter servers in one OpenShift cluster.
Multiple vCenter support in the Cloud Provider Interface (CPI) and the Container Storage Interface (CSI).
Customers want to deploy OpenShift across multiple vSphere clusters (vCenters) primarily for high availability.
This section contains all the test cases that we need to make sure work as part of the done^3 criteria.
This section contains all scenarios that are considered out of scope for this enhancement that will be done via a separate epic / feature / story.
As an OpenShift administrator, I would like the Cluster Config Operator (CCO) to not block the install of a new cluster when the vSphere Multi vCenter feature gate is enabled, so that I can begin to install my cluster across multiple vCenters.
The purpose of this story is to perform the changes needed to get CCO to allow the configuration of the new feature gate for vSphere Multi vCenter support. This operator takes the infrastructure config provided by the installer, updates it for the cluster, and applies it. The only change to this operator should be updating the version of openshift/api; however, this operator has a lot of legacy code that is being transitioned to openshift/api, which is currently causing issues when updating the openshift/api version. We will update the version and address any other modifications as needed. We will need to work with the API team on what can be removed.
The CCO during installation will need to allow multiple vCenters to be configured. Any other failure reported due to issues performing operator tasks is valid and should be addressed via a new story.
We will need to do the following:
We will need to enhance all logic that hard-codes the vCenter count to check whether the vSphere Multi vCenter feature gate is enabled. If it is enabled, the vCenter count may be greater than 1; otherwise, validation must still fail with the error message that the vCenter count may not be greater than 1.
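For context, the configuration being gated looks roughly like this install-config excerpt; the server names, credentials, datacenters, and the feature gate name are placeholders/assumptions:

```yaml
# install-config.yaml excerpt (sketch)
platform:
  vsphere:
    vcenters:                          # >1 entry requires the Multi vCenter gate
    - server: vcenter-a.example.com    # placeholder vCenter
      user: admin@vsphere.local        # placeholder credentials
      password: REDACTED
      datacenters:
      - dc-east
    - server: vcenter-b.example.com    # placeholder vCenter
      user: admin@vsphere.local
      password: REDACTED
      datacenters:
      - dc-west
featureSet: CustomNoUpgrade
featureGates:
- VSphereMultiVCenters=true            # assumed feature gate name
```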
TBD
Known affected components:
[2] https://github.com/openshift/cluster-kube-controller-manager-operator/blob/351c1193c7eebb49054a289a17fc25dfc0e0cd73/bindata/bootkube/manifests/secret-csr-signer-signer.yaml#L10
[3] https://github.com/openshift/cluster-etcd-operator/pull/1234
Create a GCP cloud-specific spec.resourceTags entry in the infrastructure CRD. This should create and update tags (or labels in GCP) on any OpenShift cloud resource that we create and manage. The behaviour should also tag existing resources that do not have the tags yet, and once the tags in the infrastructure CRD are changed, all the resources should be updated accordingly.
Tag deletes continue to be out of scope, as the customer can still have custom tags applied to the resources that we do not want to delete.
Due to the ongoing in-tree/out-of-tree split of the cloud and CSI providers, this should not apply to clusters with in-tree providers (!= "external").
Once we are confident that we have all components updated, we should introduce an end-to-end test that makes sure we never create resources that are untagged.
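A minimal sketch of the proposed entry, following the description above; the exact schema and field placement are assumptions, and the parent ID and tag values are placeholders:

```yaml
apiVersion: config.openshift.io/v1
kind: Infrastructure
metadata:
  name: cluster
spec:
  platformSpec:
    type: GCP
    gcp:
      resourceTags:                  # assumed placement per the description above
      - parentID: "012345678901"     # placeholder GCP organization/project number
        key: cost-center
        value: engineering
```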
Goals
Requirement | Notes | isMvp? |
---|---|---|
CI - MUST be running successfully with test automation | This is a requirement for ALL features. | YES |
Release Technical Enablement | Provide necessary release enablement details and documents. | YES |
List any affected packages or components.
The TechPreview featureSet check added in the machine-api-provider-gcp operator for userLabels and userTags should be removed. The new featureGate added in openshift/api should also be removed.
Acceptance Criteria
As an OpenShift admin who wants to make my OCP cluster more secure and stable, I want to prevent anyone from scheduling their workloads on master nodes, so that master nodes only run OCP management-related workloads.
Secure OCP master nodes by preventing customer workloads from being scheduled on them.
Anyone applying toleration(s) in a pod spec can unintentionally tolerate the master taints that protect master nodes from receiving application workloads. An admission plugin needs to be configured to protect master nodes from this scenario. Besides the taint/toleration route, users can also set spec.nodeName directly, bypassing the scheduler, which this plugin should also protect master nodes against.
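To make the two bypass vectors concrete, a pod spec like the following would land on a master node unless the admission plugin rejects it (names and image are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-workload       # illustrative customer pod
spec:
  # Vector 1: a blanket toleration that also matches the master taints.
  tolerations:
  - operator: Exists
  # Vector 2: bypass the scheduler entirely by naming the node.
  nodeName: master-0           # placeholder master node name
  containers:
  - name: app
    image: registry.example.com/app:latest   # placeholder image
```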
Needed so we can provide this workflow to customers following the proposal at https://github.com/openshift/enhancements/pull/1583
Reference https://issues.redhat.com/browse/WRKLDS-1015
kube-controller-manager pods are created by code residing in controllers provided by the kube-controller-manager operator. So changes are required in that repo to add a toleration for the node-role.kubernetes.io/control-plane:NoExecute taint.
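The change amounts to adding a toleration along these lines to the kube-controller-manager pod manifests (a sketch, not the exact manifest diff):

```yaml
tolerations:
- key: node-role.kubernetes.io/control-plane
  operator: Exists
  effect: NoExecute
```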
Console enhancements based on customer RFEs that improve customer user experience.
Requirement | Notes | isMvp? |
---|
CI - MUST be running successfully with test automation | This is a requirement for ALL features. | YES |
Release Technical Enablement | Provide necessary release enablement details and documents. | YES |
This Section:
This Section: What does the person writing code, testing, documenting need to know? What context can be provided to frame this feature.
Questions to be addressed:
Based on this old feature request:
https://issues.redhat.com/browse/RFE-1530
we do have impersonation in place for gaining access to other users' permissions via the console. But the only documentation we currently have is how to impersonate system:admin via the CLI; see
https://docs.openshift.com/container-platform/4.14/authentication/impersonating-system-admin.html
Please provide documentation for the console feature and the required prerequisites for the users/groups accordingly.
AC:
More info on the impersonate access role - https://github.com/openshift/console/pull/13345/files
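For the prerequisites, the required access boils down to RBAC along these lines (role name and subject are illustrative):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: example-impersonator     # illustrative role name
rules:
- apiGroups: [""]
  resources: ["users", "groups"]   # allow impersonating users and groups
  verbs: ["impersonate"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: example-impersonator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: example-impersonator
subjects:
- kind: User
  name: alice@example.com        # illustrative user
```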
When the internal oauth-server and oauth-apiserver are removed and replaced with an external OIDC issuer (like Azure AD), the console must work for human users of the external OIDC issuer.
An end user can use the OpenShift console without a notable difference in experience. This must eventually work on both HyperShift and standalone, but HyperShift is the first priority if it impacts delivery.
OCP/Telco Definition of Done
Epic Template descriptions and documentation.
The console needs to be able to authenticate against an external OIDC IDP. For that, the console-operator needs to configure the console accordingly.
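A rough sketch of the target configuration on the cluster Authentication resource; the field names follow the in-progress OIDC API and, together with the issuer and client values, are assumptions:

```yaml
apiVersion: config.openshift.io/v1
kind: Authentication
metadata:
  name: cluster
spec:
  type: OIDC
  oidcProviders:
  - name: example-oidc                       # placeholder provider name
    issuer:
      issuerURL: https://idp.example.com     # placeholder issuer
      audiences:
      - openshift-console
    oidcClients:
    - componentNamespace: openshift-console
      componentName: console
      clientID: openshift-console            # placeholder client ID
```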
AC:
OCP/Telco Definition of Done
Epic Template descriptions and documentation.
Provide mechanisms for the builder service account to be made optional in core OpenShift.
< Who benefits from this feature, and how? What is the difference between today’s current state and a world with this feature? >
Requirements | Notes | IS MVP |
---|---|---|
Disable service account controller related to Build/BuildConfig when Build capability is disabled | When the API is marked as removed or disabled, stop creating the "builder" service account and its associated RBAC | Yes |
Option to disable the "builder" service account | Even if the Build capability is enabled, allow admins to disable the "builder" service account generation. Admins will need to bring their own service accounts/RBAC for builds to work | Yes |
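For reference, capabilities are toggled at install time roughly like this install-config excerpt (the enabled list is illustrative):

```yaml
# install-config.yaml excerpt (sketch)
capabilities:
  baselineCapabilitySet: None
  additionalEnabledCapabilities:
  - Ingress          # illustrative: enable only what is needed
  - ImageRegistry
  # "Build" omitted: the Build capability, and with it the builder service
  # account and its RBAC controllers, stays disabled.
```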
< What are we making, for who, and why/what problem are we solving?>
<Defines what is not included in this story>
< Link or at least explain any known dependencies. >
Background, and strategic fit
< What does the person writing code, testing, documenting need to know? >
< Are there assumptions being made regarding prerequisites and dependencies?>
< Are there assumptions about hardware, software or people resources?>
< Are there specific customer environments that need to be considered (such as working with existing h/w and software)?>
< What educational or reference material (docs) is required to support this product feature? For users/admins? Other functions (security officers, etc)? >
< Does this feature have doc impact? Possible values are: New Content, Updates to existing content, Release Note, or No Doc Impact?>
< Are there assumptions being made regarding prerequisites and dependencies?>
< Are there assumptions about hardware, software or people resources?>
< If the feature is ordered with other work, state the impact of this feature on the other work>
As a cluster admin trying to disable the Build, DeploymentConfig, and Image Registry capabilities, I want the RBAC controllers for the builder and deployer service accounts and the default image-registry rolebindings disabled when their respective capability is disabled.
<Describes high level purpose and goal for this story. Answers the questions: Who is impacted, what is it and why do we need it? How does it improve the customer's experience?>
<Describes the context or background related to this story>
In WRKLDS-695, ocm-o was enhanced to disable the Build and DeploymentConfig controllers when the respective capability was disabled. This logic should be extended to include the controllers that set up the service accounts and role bindings for these respective features.
<Defines what is not included in this story>
<Description of the general technical path on how to achieve the goal of the story. Include details like json schema, class definitions>
<Describes what this story depends on. Dependent Stories and EPICs should be linked to the story.>
Dependencies identified
Blockers noted and expected delivery timelines set
Design is implementable
Acceptance criteria agreed upon
Story estimated
Unknown
Verified
Unsatisfied