feat: Provision minimal TokenReview RBAC for OIDC auth and add SSL error logging in token parser#6240
Conversation
jyejare
left a comment
There was a problem hiding this comment.
-
The Kubernetes auth path sets a status condition (AuthorizationReadyType) to indicate RBAC provisioning success/failure. The OIDC path does not. This means operators have no visibility into whether the OIDC RBAC was actually created.
-
The existing OIDC test (featurestore_controller_oidc_auth_test.go) verifies that the Kubernetes-auth Role/RoleBinding are absent, but it does not verify that the new feast-oidc-token-review ClusterRole and per-instance ClusterRoleBinding are created with the correct rules. It also doesn't test the cleanup path (switching from OIDC to no-auth should delete the CRB).
| func (authz *FeastAuthorization) getLabels() map[string]string { | ||
| return map[string]string{ | ||
| services.NameLabelKey: authz.Handler.FeatureStore.Name, |
There was a problem hiding this comment.
getLabels() stamps the FeatureStore-specific name. But the ClusterRole feast-oidc-token-review is shared across all OIDC FeatureStore instances. The last instance to reconcile overwrites the labels with its own name. This creates misleading audit trails — the ClusterRole appears to belong to one FeatureStore when it actually serves all of them.
Recommendation: Either use instance-independent labels for the shared ClusterRole, or use an aggregated label approach.
There was a problem hiding this comment.
Addressed by using instance-independent labels for the shared ClusterRole
4da984f to
15c8ec5
Compare
| apimeta.RemoveStatusCondition(&authz.Handler.FeatureStore.Status.Conditions, feastKubernetesAuthConditions[metav1.ConditionTrue].Type) | ||
|
|
||
| // Clean up cluster-scoped Kubernetes auth CRB (handles Kubernetes→OIDC or Kubernetes→no-auth transitions) | ||
| authz.cleanupKubernetesClusterRbac() |
There was a problem hiding this comment.
want to run this on every OIDC reconcile ?
There was a problem hiding this comment.
Check how DeleteOwnedFeastObj in handler.go already does the same thing - get first, early-return on IsNotFound, then Delete.
557427c to
57a1824
Compare
When authz: oidc is configured, the operator now provisions a dedicated feast-oidc-token-review ClusterRole and per-instance ClusterRoleBinding with tokenreviews/create permission for SA token delegation. Changes: - Add OIDC status condition (AuthorizationReadyType) for feature parity with Kubernetes auth - Use instance-independent labels for shared ClusterRole to avoid misleading audit trails when multiple FeatureStores use OIDC - Clean up Kubernetes ClusterRoleBinding when switching auth types - Add test coverage for OIDC RBAC creation and cleanup Signed-off-by: Aniket Paluskar <[email protected]>
Signed-off-by: Aniket Paluskar <[email protected]>
Signed-off-by: Aniket Paluskar <[email protected]>
Signed-off-by: Aniket Paluskar <[email protected]>
fa88079 to
efc6d87
Compare
# [0.63.0](v0.62.0...v0.63.0) (2026-05-04) ### Bug Fixes * Add project filter to apply_data_source and delete_data_source (closes [#6206](#6206)) ([#6322](#6322)) ([96562c4](96562c4)) * Add project_id filter to SnowflakeRegistry UPDATE path ([#6243](#6243)) ([6658b71](6658b71)), closes [#6208](#6208) [#6208](#6208) * Add subprocess timeouts to prevent test_e2e_local hanging on Dask atexit handler ([3de6556](3de6556)) * Ambiguous truth value of array during materialization ([#6259](#6259)) ([d0c8984](d0c8984)) * Auto-detect GCS/S3 registry store when registry is passed as string ([#6260](#6260)) ([7ebcf03](7ebcf03)) * **bigquery:** Prefer query over table in get_table_query_string ([#6360](#6360)) ([77ed779](77ed779)), closes [#6200](#6200) * correct project_id scoping in get_user_metadata and delete_project ([0c469a7](0c469a7)) * disable Redis RDB persistence in test deployments ([44cd682](44cd682)) * Disable snowflake tests temporarily in CI ([#6356](#6356)) ([31d5a98](31d5a98)) * Filter empty SQL commands at execute_snowflake_statement call sites ([#6249](#6249)) ([92ffbb9](92ffbb9)) * Fix five bugs in milvus online store ([#6275](#6275)) ([212504b](212504b)) * Fix issue with apply feature view ([835cda8](835cda8)) * Fix streaming materialization for exotic sources with lazy UDF pipelines ([c07972d](c07972d)) * Handle missing features gracefully instead of panicking ([7d00b3a](7d00b3a)) * Harden informer cache with label selectors and memory optimizations ([#6242](#6242)) ([3f11356](3f11356)) * **helm:** Avoid nil pointer for metrics.enabled inside podAnnotations ([#6251](#6251)) ([c833f1a](c833f1a)) * Include git in feast server image ([fb03c46](fb03c46)) * Include StreamFeatureView in freshness metric ([#6269](#6269)) ([463f16c](463f16c)) * Pre-create S3A event log dir before SparkContext init ([#6317](#6317)) ([9feca77](9feca77)) * Remote Online Store Type Inference Error with All-NULL Columns ([#6063](#6063)) ([de67bdd](de67bdd)) * Remove selector with kustomize overlay using a JSON 6902 patch ([9107a43](9107a43)) * Resolve multiple bugs in SnowflakeRegistry and Snowflake connection handling ([#6315](#6315)) ([7e66a2e](7e66a2e)) * **spark:** BatchFeatureView with TransformationMode.PYTHON now reads all source columns ([a310eaf](a310eaf)) * **spark:** Use SELECT * when feature_name_columns is empty in pull_all_from_table_or_query ([e1b1d2d](e1b1d2d)) * Support pandas mode in feature builder and fix dask column extraction ([863315e](863315e)) * support SQL string as entity_df in RemoteOfflineStore.get_historical_features ([c559889](c559889)) * Wrap LocalOutputNode return value in ArrowTableValue for consist… ([#6286](#6286)) ([a16cd55](a16cd55)) ### Features * Add agent skills and Cursor/Claude rules for Feast development ([312eea3](312eea3)) * Add feature view versioning support to FAISS online store ([b36acb7](b36acb7)) * Add feature view versioning support to Redis and DynamoDB online stores ([#6257](#6257)) ([edf25af](edf25af)), closes [#6164](#6164) [#6163](#6163) * Add optional 'org' in feature view ([#6288](#6288)) ([#6301](#6301)) ([608b105](608b105)) * Add RaySource, to_ray_dataset first-class method, docs, and tests ([1c98157](1c98157)) * Add TLS support for Go Feature Server ([#6229](#6229)) ([28a58d0](28a58d0)) * Add Vector Search support to MongoDBOnlineStore ([#6344](#6344)) ([c102738](c102738)) * Add versioning support to Milvus online store ([#6330](#6330)) ([3268ced](3268ced)) * Addresses performance issues in the Redis online store ([2e50da0](2e50da0)) * Allow to set gpu for ray ([5580ab4](5580ab4)) * Bump redis-py version cap from <5 to <8 ([#6339](#6339)) ([9538180](9538180)) * Expose feature_server, materialization, and openlineage configuration via FeatureStore CRD ([ec6ecfd](ec6ecfd)) * Make online_write_batch_size configurable in MaterializationConfig ([#6268](#6268)) ([d41becf](d41becf)) * Make udf optional if agg defined ([#5689](#5689)) ([#6328](#6328)) ([f630056](f630056)) * MongoDB offline store ([#6138](#6138)) ([8eebad7](8eebad7)) * Optional input_schema for ODFV ([#6308](#6308)) ([#6312](#6312)) ([f08b4e8](f08b4e8)) * Provision minimal TokenReview RBAC for OIDC auth and add SSL error logging in token parser ([#6240](#6240)) ([dca57e8](dca57e8)) * **spark:** Add compute-on-read support for BatchFeatureView in get_… ([#6357](#6357)) ([630d9f8](630d9f8))
What this PR does / why we need it:
When `authz: oidc` is configured, the Feast server delegates Kubernetes service account (SA) tokens to a lightweight TokenReview for validation and namespace extraction. This requires the server SA to have `tokenreviews/create` permission. Previously, this RBAC was not provisioned automatically by the operator for OIDC deployments (only for `authz: kubernetes`), requiring manual ClusterRole creation.Operator: OIDC TokenReview RBAC
The operator now provisions a dedicated
feast-oidc-token-reviewClusterRole and ClusterRoleBinding whenauthz: oidcis configured. The ClusterRole contains exactly one rule:authentication.k8s.io/tokenreviews/createThis is the minimum permission needed for the SA token delegation path. No additional RBAC queries (rolebindings, clusterroles, namespaces) are granted, unlike the
authz: kubernetespath which needs broader permissions forKubernetesTokenParser.Cleanup is handled automatically when switching auth types:
SDK: SSL Error Logging
When
verify_ssl: trueis set but the OIDC provider uses self-signed certificates without a configuredca_cert_path, the server fails to reach the JWKS/discovery endpoints. Previously, this produced a generic "Invalid token" log with no indication of the root cause. The token parser now detects SSL errors in the exception chain and logs a clear, actionable message:This applies to both the discovery endpoint (
_validate_token) and the JWKS endpoint (_decode_token) error paths.Which issue(s) this PR fixes:
Follow up to #6089
Checks
git commit -s)Testing Strategy
Misc