feat(operator): Auto-create KubeRay RBAC for Feast service account #6411
Open
aravind-n wants to merge 1 commit into
When the batch-engine ConfigMap selects the Ray engine in KubeRay mode
(type: ray.engine, use_kuberay: true), the Feast service pod uses the
CodeFlare SDK to discover RayCluster resources and read mTLS Secrets.
Previously the Feast SA had no permissions on either, so the SDK calls
returned 403 and users had to apply the Role + RoleBinding by hand
before every materialization run.
This change makes the operator provision them automatically:
- New services/ray_rbac.go reads the batch-engine ConfigMap once per
  reconcile and, when KubeRay is selected, CreateOrUpdates a
  namespace-scoped Role + RoleBinding named feast-<crName>-kuberay
  granting the Feast SA:
    ray.io/rayclusters: get, list, watch
    core/secrets: get, list, watch, create, update, delete
- Both resources are owner-referenced to the FeatureStore so they
  GC with the CR. When use_kuberay flips back to false (or the
  batchEngine block is removed), they are deleted on the next
  reconcile.
- The operator's own kubebuilder RBAC markers are widened to match
  so it can hand those verbs to the Feast SA (k8s RBAC escalation
  rules require the granter to hold the granted verbs). config/rbac
  and dist/install.yaml are regenerated accordingly.
- New ginkgo suite covers create-on-enable, delete-on-disable, and
  no-op when batchEngine is absent.
- 06-batch-and-jobs.md documents the new auto-RBAC behavior so users
  know manual setup is no longer required.
Fixes feast-dev#6408
Signed-off-by: Aravind Nidadavolu <[email protected]>
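
The escalation point in the commit message can be illustrated with a small, simplified sketch. It is not the operator's code: `PolicyRule` is a hypothetical stand-in for the Kubernetes `rbacv1.PolicyRule` type, `missingVerbs` is an illustrative helper, and the "before" operator permissions in `main` are made up for the demo. The real API server check also handles wildcards and resource names, which this sketch ignores.

```go
package main

import "fmt"

// PolicyRule is a simplified, hypothetical stand-in for rbacv1.PolicyRule.
type PolicyRule struct {
	APIGroup string // "" denotes the core API group
	Resource string
	Verbs    []string
}

// feastKubeRayRules mirrors the permissions listed in the commit message.
func feastKubeRayRules() []PolicyRule {
	return []PolicyRule{
		{APIGroup: "ray.io", Resource: "rayclusters", Verbs: []string{"get", "list", "watch"}},
		{APIGroup: "", Resource: "secrets", Verbs: []string{"get", "list", "watch", "create", "update", "delete"}},
	}
}

// missingVerbs lists every "group/resource:verb" in wanted that the granter
// does not hold itself. A non-empty result means the grant would be rejected
// under the (simplified) escalation rule.
func missingVerbs(granter, wanted []PolicyRule) []string {
	held := map[string]bool{}
	for _, r := range granter {
		for _, v := range r.Verbs {
			held[r.APIGroup+"/"+r.Resource+":"+v] = true
		}
	}
	var missing []string
	for _, r := range wanted {
		for _, v := range r.Verbs {
			key := r.APIGroup + "/" + r.Resource + ":" + v
			if !held[key] {
				missing = append(missing, key)
			}
		}
	}
	return missing
}

func main() {
	// Hypothetical operator permissions that lack the ray.io verbs.
	operator := []PolicyRule{
		{APIGroup: "", Resource: "secrets", Verbs: []string{"get", "list", "watch", "create", "update", "delete"}},
	}
	fmt.Println("operator cannot grant:", missingVerbs(operator, feastKubeRayRules()))
}
```

Once the operator's own rules cover every verb it hands out, `missingVerbs` returns an empty slice, which is the state the widened kubebuilder markers aim for.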
08a3fc9 to 2f10a08
What this PR does / why we need it:
When a FeatureStore selects the Ray compute engine in KubeRay mode (type: ray.engine, use_kuberay: true), the Feast pod uses the CodeFlare SDK to discover a RayCluster and read mTLS Secrets. Previously the Feast service account had no permissions on ray.io/rayclusters or core/secrets, so every materialization failed with 403 Forbidden and users had to hand-apply a Role + RoleBinding before each deployment.

This PR makes the operator provision that RBAC automatically:
- internal/controller/services/ray_rbac.go detects KubeRay mode from the batch-engine ConfigMap and CreateOrUpdates a namespace-scoped Role + RoleBinding named feast-<crName>-kuberay, owner-referenced to the FeatureStore for automatic GC. When use_kuberay flips back to false (or batchEngine is removed) the resources are deleted on the next reconcile.
- The Role grants the Feast service account:
  - ray.io/rayclusters → get, list, watch
  - core/secrets → get, list, watch, create, update, delete
- The RBAC step runs in FeastServices.Deploy() right after createServiceAccount() so the binding subject exists before the Role applies.
- config/rbac/role.yaml and dist/install.yaml are regenerated via make manifests / make build-installer.
- use_kuberay: true already exists in the Ray compute engine config; the operator just learns to act on it.
- docs/how-to-guides/feast-operator/06-batch-and-jobs.md documents the new behavior so users know manual RBAC setup is no longer required.

Which issue(s) this PR fixes:
Fixes #6408
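
As a rough illustration of the detection step described above, the sketch below decides whether the Role and RoleBinding should exist for a given batch-engine configuration and derives their shared name. It is an assumption-laden sketch, not the code in ray_rbac.go: `BatchEngineConfig`, `Action`, `decide`, and `kubeRayRBACName` are hypothetical names, and the ConfigMap is assumed to be already parsed.

```go
package main

import "fmt"

// BatchEngineConfig is a hypothetical, already-parsed view of the
// batch-engine ConfigMap. A nil pointer means no batchEngine is configured.
type BatchEngineConfig struct {
	Type       string // e.g. "ray.engine"
	UseKubeRay bool   // value of use_kuberay
}

// Action is what a reconcile pass should do with the RBAC objects.
type Action string

const (
	ActionEnsure Action = "ensure" // create or update Role + RoleBinding
	ActionDelete Action = "delete" // remove them if present (idempotent)
)

// kubeRayRBACName builds the name shared by the Role and RoleBinding.
func kubeRayRBACName(crName string) string {
	return "feast-" + crName + "-kuberay"
}

// decide returns ActionEnsure only when the Ray engine is selected in
// KubeRay mode; every other case, including a missing config, cleans up.
func decide(cfg *BatchEngineConfig) Action {
	if cfg != nil && cfg.Type == "ray.engine" && cfg.UseKubeRay {
		return ActionEnsure
	}
	return ActionDelete
}

func main() {
	var absent *BatchEngineConfig // batchEngine block not set
	enabled := &BatchEngineConfig{Type: "ray.engine", UseKubeRay: true}
	disabled := &BatchEngineConfig{Type: "ray.engine", UseKubeRay: false}
	for _, cfg := range []*BatchEngineConfig{absent, enabled, disabled} {
		fmt.Println(kubeRayRBACName("sample"), decide(cfg))
	}
}
```

Treating the absent case as a delete keeps the logic to a single branch: deleting objects that were never created is a no-op, so removing the batchEngine block and never setting it behave the same way.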
Checks
- Commits are signed off (git commit -s)
Testing Strategy
Misc
New ginkgo suite at internal/controller/featurestore_controller_kuberay_rbac_test.go covers three reconciler-level flows against an envtest API server:
- Role + RoleBinding are created when the batch-engine ConfigMap sets type: ray.engine and use_kuberay: true.
- Both are deleted when use_kuberay flips to false.
- Nothing is created when no batchEngine is configured.

The operator's full suite (make fmt vet lint test) passes locally with these changes.
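
The three flows can be walked through with a toy model. This is only a sketch of the expected state transitions, not the ginkgo suite itself: `fakeCluster` is a hypothetical in-memory map standing in for the envtest API server, `reconcileKubeRayRBAC` is an illustrative function, and the CR name "sample" is made up.

```go
package main

import "fmt"

// fakeCluster is a hypothetical in-memory stand-in for the API server,
// keyed by "Kind/name".
type fakeCluster map[string]bool

// reconcileKubeRayRBAC creates both objects when KubeRay mode is on and
// deletes them otherwise. Both paths are idempotent.
func reconcileKubeRayRBAC(c fakeCluster, crName string, kubeRayEnabled bool) {
	name := "feast-" + crName + "-kuberay"
	for _, kind := range []string{"Role", "RoleBinding"} {
		key := kind + "/" + name
		if kubeRayEnabled {
			c[key] = true
		} else {
			delete(c, key) // no-op when the key is absent
		}
	}
}

func main() {
	c := fakeCluster{}

	reconcileKubeRayRBAC(c, "sample", false) // no batchEngine configured
	fmt.Println("objects with batchEngine absent:", len(c))

	reconcileKubeRayRBAC(c, "sample", true) // use_kuberay: true
	fmt.Println("objects after enable:", len(c))

	reconcileKubeRayRBAC(c, "sample", false) // use_kuberay flips to false
	fmt.Println("objects after disable:", len(c))
}
```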