feat: Provision minimal TokenReview RBAC for OIDC auth and add SSL error logging in token parser #6240

Merged
ntkathole merged 4 commits into feast-dev:master from aniketpalu:oidc-rbac-ssl-logging
May 1, 2026

Contributor

@aniketpalu aniketpalu commented Apr 8, 2026

What this PR does / why we need it:

When `authz: oidc` is configured, the Feast server delegates Kubernetes service account (SA) tokens to a lightweight TokenReview for validation and namespace extraction. This requires the server SA to have `tokenreviews/create` permission. Previously, this RBAC was not provisioned automatically by the operator for OIDC deployments (only for `authz: kubernetes`), requiring manual ClusterRole creation.

Operator: OIDC TokenReview RBAC

The operator now provisions a dedicated `feast-oidc-token-review` ClusterRole and ClusterRoleBinding when `authz: oidc` is configured. The ClusterRole contains exactly one rule:

  • authentication.k8s.io/tokenreviews/create

This is the minimum permission needed for the SA token delegation path. No permissions for additional RBAC queries (rolebindings, clusterroles, namespaces) are granted, unlike the `authz: kubernetes` path, which needs broader permissions for `KubernetesTokenParser`.
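
For illustration, the single rule described above can be written out as a manifest. The sketch below builds it as a plain Python dictionary. The `feast-oidc-token-review` name and the `tokenreviews`/`create` rule come from this PR; the helper name is hypothetical, labels are omitted, and the operator itself constructs this object in Go, so treat this as a reading aid rather than the operator's code.

```python
import json


def oidc_token_review_cluster_role() -> dict:
    """Hypothetical helper: the minimal ClusterRole expressed as a manifest dict."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRole",
        "metadata": {"name": "feast-oidc-token-review"},
        # Exactly one rule: create TokenReview objects, nothing else.
        "rules": [
            {
                "apiGroups": ["authentication.k8s.io"],
                "resources": ["tokenreviews"],
                "verbs": ["create"],
            }
        ],
    }


if __name__ == "__main__":
    # Print the manifest as JSON, which kubectl also accepts.
    print(json.dumps(oidc_token_review_cluster_role(), indent=2))
```

Keeping the rule list to a single entry makes the grant easy to audit: anything beyond `create` on `tokenreviews` would be outside what the SA token delegation path needs.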

Cleanup is handled automatically when switching auth types:

  • OIDC to kubernetes: OIDC ClusterRole + ClusterRoleBinding deleted
  • OIDC to no_auth: OIDC ClusterRole + ClusterRoleBinding deleted
  • kubernetes/no_auth to OIDC: OIDC ClusterRole + ClusterRoleBinding created
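
The transitions listed above can be read as a small decision table. The following Python sketch encodes it; the function name and the `"keep"`/`"none"` outcomes are assumptions added for completeness, and how the operator actually detects a change of auth type is not shown in this description.

```python
from typing import Literal

AuthType = Literal["oidc", "kubernetes", "no_auth"]


def oidc_rbac_action(previous: AuthType, current: AuthType) -> str:
    """Hypothetical summary of what happens to the OIDC ClusterRole + ClusterRoleBinding."""
    if current == "oidc":
        # Switching to OIDC creates the objects; staying on OIDC leaves them in place.
        return "keep" if previous == "oidc" else "create"
    # Leaving OIDC deletes the objects; other transitions never had them.
    return "delete" if previous == "oidc" else "none"
```

For example, `oidc_rbac_action("oidc", "no_auth")` yields `"delete"`, matching the second bullet.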

SDK: SSL Error Logging

When `verify_ssl: true` is set but the OIDC provider uses self-signed certificates without a configured `ca_cert_path`, the server fails to reach the JWKS/discovery endpoints. Previously, this produced a generic "Invalid token" log with no indication of the root cause. The token parser now detects SSL errors in the exception chain and logs a clear, actionable message:

OIDC provider SSL certificate verification failed. If using a self-signed certificate,
set verify_ssl: false or provide a CA certificate via ca_cert_path.

This applies to both the discovery endpoint (`_validate_token`) and the JWKS endpoint (`_decode_token`) error paths.
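
As a rough sketch of what "detects SSL errors in the exception chain" can look like, the Python below walks `__cause__`/`__context__` links looking for an `ssl.SSLError`. The helper names `find_ssl_error` and `log_token_error` are hypothetical and the PR's actual detection logic may differ (for example, it may also match library-specific wrapper exceptions); only the log message text is taken from this description.

```python
import logging
import ssl
from typing import Optional

logger = logging.getLogger(__name__)

SSL_HINT = (
    "OIDC provider SSL certificate verification failed. If using a self-signed "
    "certificate, set verify_ssl: false or provide a CA certificate via ca_cert_path."
)


def find_ssl_error(exc: Optional[BaseException]) -> Optional[ssl.SSLError]:
    """Return the first ssl.SSLError in the exception chain, or None."""
    seen = set()
    current = exc
    while current is not None and id(current) not in seen:
        seen.add(id(current))  # guard against cyclic chains
        if isinstance(current, ssl.SSLError):
            return current
        # Prefer the explicit cause ("raise ... from ..."), else the implicit context.
        if current.__cause__ is not None:
            current = current.__cause__
        else:
            current = current.__context__
    return None


def log_token_error(exc: BaseException) -> None:
    """Hypothetical error path: log the actionable hint when SSL is the root cause."""
    if find_ssl_error(exc) is not None:
        logger.error(SSL_HINT)
    else:
        logger.error("Invalid token")
```

Walking the chain matters because HTTP and JWT libraries typically wrap the underlying SSL failure in their own exception types, so checking only the outermost exception would miss it.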

Which issue(s) this PR fixes:

Follow up to #6089

Checks

  • I've made sure the tests are passing.
  • My commits are signed off (git commit -s)
  • My PR title follows conventional commits format

Testing Strategy

  • Unit tests
  • Integration tests
  • Manual tests
  • Testing is not required for this change

Misc


@aniketpalu aniketpalu requested a review from a team as a code owner April 8, 2026 15:11
Collaborator

@jyejare jyejare left a comment

  1. The Kubernetes auth path sets a status condition (AuthorizationReadyType) to indicate RBAC provisioning success/failure. The OIDC path does not. This means operators have no visibility into whether the OIDC RBAC was actually created.

  2. The existing OIDC test (featurestore_controller_oidc_auth_test.go) verifies that the Kubernetes-auth Role/RoleBinding are absent, but it does not verify that the new feast-oidc-token-review ClusterRole and per-instance ClusterRoleBinding are created with the correct rules. It also doesn't test the cleanup path (switching from OIDC to no-auth should delete the CRB).

Comment on lines 400 to 402
func (authz *FeastAuthorization) getLabels() map[string]string {
return map[string]string{
services.NameLabelKey: authz.Handler.FeatureStore.Name,
Collaborator

getLabels() stamps the FeatureStore-specific name. But the ClusterRole feast-oidc-token-review is shared across all OIDC FeatureStore instances. The last instance to reconcile overwrites the labels with its own name. This creates misleading audit trails — the ClusterRole appears to belong to one FeatureStore when it actually serves all of them.
Recommendation: Either use instance-independent labels for the shared ClusterRole, or use an aggregated label approach.

Contributor Author

Addressed by using instance-independent labels for the shared ClusterRole

@aniketpalu aniketpalu force-pushed the oidc-rbac-ssl-logging branch from 4da984f to 15c8ec5 Compare April 27, 2026 16:06
Collaborator

@jyejare jyejare left a comment

LGTM, small comment.

Comment thread infra/feast-operator/internal/controller/authz/authz.go
@aniketpalu aniketpalu requested a review from jyejare April 30, 2026 08:58
apimeta.RemoveStatusCondition(&authz.Handler.FeatureStore.Status.Conditions, feastKubernetesAuthConditions[metav1.ConditionTrue].Type)

// Clean up cluster-scoped Kubernetes auth CRB (handles Kubernetes→OIDC or Kubernetes→no-auth transitions)
authz.cleanupKubernetesClusterRbac()
Member

Do we want to run this on every OIDC reconcile?

Member

Check how DeleteOwnedFeastObj in handler.go already does the same thing - get first, early-return on IsNotFound, then Delete.

@aniketpalu aniketpalu force-pushed the oidc-rbac-ssl-logging branch from 557427c to 57a1824 Compare April 30, 2026 20:49
aniketpalu added 4 commits May 1, 2026 10:11
When authz: oidc is configured, the operator now provisions a dedicated
feast-oidc-token-review ClusterRole and per-instance ClusterRoleBinding
with tokenreviews/create permission for SA token delegation.
Changes:
- Add OIDC status condition (AuthorizationReadyType) for feature parity
  with Kubernetes auth
- Use instance-independent labels for shared ClusterRole to avoid
  misleading audit trails when multiple FeatureStores use OIDC
- Clean up Kubernetes ClusterRoleBinding when switching auth types
- Add test coverage for OIDC RBAC creation and cleanup

Signed-off-by: Aniket Paluskar <[email protected]>
Signed-off-by: Aniket Paluskar <[email protected]>
@ntkathole ntkathole force-pushed the oidc-rbac-ssl-logging branch from fa88079 to efc6d87 Compare May 1, 2026 04:41
Member

@ntkathole ntkathole left a comment

lgtm

@ntkathole ntkathole merged commit dca57e8 into feast-dev:master May 1, 2026
21 of 26 checks passed
franciscojavierarceo pushed a commit that referenced this pull request May 4, 2026
# [0.63.0](v0.62.0...v0.63.0) (2026-05-04)

### Bug Fixes

* Add project filter to apply_data_source and delete_data_source (closes [#6206](#6206)) ([#6322](#6322)) ([96562c4](96562c4))
* Add project_id filter to SnowflakeRegistry UPDATE path ([#6243](#6243)) ([6658b71](6658b71)), closes [#6208](#6208) [#6208](#6208)
* Add subprocess timeouts to prevent test_e2e_local hanging on Dask atexit handler ([3de6556](3de6556))
* Ambiguous truth value of array during materialization ([#6259](#6259)) ([d0c8984](d0c8984))
* Auto-detect GCS/S3 registry store when registry is passed as string ([#6260](#6260)) ([7ebcf03](7ebcf03))
* **bigquery:** Prefer query over table in get_table_query_string ([#6360](#6360)) ([77ed779](77ed779)), closes [#6200](#6200)
* correct project_id scoping in get_user_metadata and delete_project ([0c469a7](0c469a7))
* disable Redis RDB persistence in test deployments ([44cd682](44cd682))
* Disable snowflake tests temporarily in CI ([#6356](#6356)) ([31d5a98](31d5a98))
* Filter empty SQL commands at execute_snowflake_statement call sites ([#6249](#6249)) ([92ffbb9](92ffbb9))
* Fix five bugs in milvus online store ([#6275](#6275)) ([212504b](212504b))
* Fix issue with apply feature view ([835cda8](835cda8))
* Fix streaming materialization for exotic sources with lazy UDF pipelines ([c07972d](c07972d))
* Handle missing features gracefully instead of panicking ([7d00b3a](7d00b3a))
* Harden informer cache with label selectors and memory optimizations ([#6242](#6242)) ([3f11356](3f11356))
* **helm:** Avoid nil pointer for metrics.enabled inside podAnnotations ([#6251](#6251)) ([c833f1a](c833f1a))
* Include git in feast server image ([fb03c46](fb03c46))
* Include StreamFeatureView in freshness metric ([#6269](#6269)) ([463f16c](463f16c))
* Pre-create S3A event log dir before SparkContext init ([#6317](#6317)) ([9feca77](9feca77))
* Remote Online Store Type Inference Error with All-NULL Columns ([#6063](#6063)) ([de67bdd](de67bdd))
* Remove selector with kustomize overlay using a JSON 6902 patch ([9107a43](9107a43))
* Resolve multiple bugs in SnowflakeRegistry and Snowflake connection handling ([#6315](#6315)) ([7e66a2e](7e66a2e))
* **spark:** BatchFeatureView with TransformationMode.PYTHON now reads all source columns ([a310eaf](a310eaf))
* **spark:** Use SELECT * when feature_name_columns is empty in pull_all_from_table_or_query ([e1b1d2d](e1b1d2d))
* Support pandas mode in feature builder and fix dask column extraction ([863315e](863315e))
* support SQL string as entity_df in RemoteOfflineStore.get_historical_features ([c559889](c559889))
* Wrap LocalOutputNode return value in ArrowTableValue for consist… ([#6286](#6286)) ([a16cd55](a16cd55))

### Features

* Add agent skills and Cursor/Claude rules for Feast development ([312eea3](312eea3))
* Add feature view versioning support to FAISS online store ([b36acb7](b36acb7))
* Add feature view versioning support to Redis and DynamoDB online stores ([#6257](#6257)) ([edf25af](edf25af)), closes [#6164](#6164) [#6163](#6163)
* Add optional 'org' in feature view ([#6288](#6288)) ([#6301](#6301)) ([608b105](608b105))
* Add RaySource, to_ray_dataset first-class method, docs, and tests ([1c98157](1c98157))
* Add TLS support for Go Feature Server ([#6229](#6229)) ([28a58d0](28a58d0))
* Add Vector Search support to MongoDBOnlineStore ([#6344](#6344)) ([c102738](c102738))
* Add versioning support to Milvus online store ([#6330](#6330)) ([3268ced](3268ced))
* Addresses performance issues in the Redis online store ([2e50da0](2e50da0))
* Allow to set gpu for ray ([5580ab4](5580ab4))
* Bump redis-py version cap from <5 to <8 ([#6339](#6339)) ([9538180](9538180))
* Expose feature_server, materialization, and openlineage configuration via FeatureStore CRD ([ec6ecfd](ec6ecfd))
* Make online_write_batch_size configurable in MaterializationConfig ([#6268](#6268)) ([d41becf](d41becf))
* Make udf optional if agg defined ([#5689](#5689)) ([#6328](#6328)) ([f630056](f630056))
* MongoDB offline store ([#6138](#6138)) ([8eebad7](8eebad7))
* Optional input_schema for ODFV ([#6308](#6308)) ([#6312](#6312)) ([f08b4e8](f08b4e8))
* Provision minimal TokenReview RBAC for OIDC auth and add SSL error logging in token parser ([#6240](#6240)) ([dca57e8](dca57e8))
* **spark:** Add compute-on-read support for BatchFeatureView in get_… ([#6357](#6357)) ([630d9f8](630d9f8))