Skip to content

fix: Add project filter to apply_data_source and delete_data_source (closes #6206)#6322

Merged
ntkathole merged 4 commits into
feast-dev:masterfrom
mailtoboggavarapu-coder:patch-1
May 1, 2026
Merged

fix: Add project filter to apply_data_source and delete_data_source (closes #6206)#6322
ntkathole merged 4 commits into
feast-dev:masterfrom
mailtoboggavarapu-coder:patch-1

Conversation

@mailtoboggavarapu-coder
Copy link
Copy Markdown
Contributor

@mailtoboggavarapu-coder mailtoboggavarapu-coder commented Apr 23, 2026

Summary

Fixes cross-project data source contamination in the file registry (issue #6206).

Root cause: apply_data_source only matched on name, not project, so applying a data source in project B could overwrite an existing source with the same name in project A.

Changes:

  • apply_data_source: deduplication now checks both name AND project
  • delete_data_source: deletion now filters by both name AND project

Tests added:

  • test_apply_data_source_cross_project_isolation: verifies project-scoped deduplication
  • test_delete_data_source_project_scoped: verifies project-scoped deletion

Closes #6206

Note: This supersedes PR #6319 (all commits have proper Signed-off-by DCO sign-off)


Open in Devin Review

Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 3 additional findings.

Open in Devin Review

@mailtoboggavarapu-coder
Copy link
Copy Markdown
Contributor Author

Hi team! Checking in on this PR which fixes the missing project filter in apply_data_source and delete_data_source (closes #6206).

Quick status update:

  • DCO ✅
  • Devin Review ✅ (no issues found)
  • Lint fixed (removed extra blank line flagged by ruff formatter)
  • No conflicts with base branch

Happy to address any feedback or make changes. Thanks for your time!

@mailtoboggavarapu-coder
Copy link
Copy Markdown
Contributor Author

Hi team! 👋 Friendly ping on feast #6322 — this fixes the missing project filter in apply_data_source and delete_data_source (closes #6206).

Status as of today:

  • ✅ DCO passing
  • ✅ Devin Review completed (issues addressed in earlier commits)
  • ✅ Branch synced with master
  • ⏳ 11 workflows awaiting maintainer CI approval (fork policy)

Happy to address any feedback!

@mailtoboggavarapu-coder
Copy link
Copy Markdown
Contributor Author

Hi @franciscojavierarceo — thanks for the approval! 🙏

Just wanted to flag the failing CI check: the only failure is test_e2e_local in unit-test-python (3.12, ubuntu-latest), which fails due to a subprocess timeout during feast teardown in test cleanup. I've re-run the job and it fails with the same timeout again (1 failed, 1606 passed) — confirming this is a pre-existing flaky infrastructure issue unrelated to this PR.

Our changes only touch apply_data_source / delete_data_source filtering logic in sdk/python/feast/infra/offline_stores/file.py and the corresponding test file. The same Python 3.12 test passes on macOS, and 3.10 + 3.11 on Ubuntu both pass — only 3.12 ubuntu hits this teardown timeout.

Would you be able to merge this given that the failure is pre-existing and unrelated to our changes? Happy to provide any additional info if needed.

@franciscojavierarceo
Copy link
Copy Markdown
Member

@mailtoboggavarapu-coder it looks like integration tests are failing for registry related issues from this change

@mailtoboggavarapu-coder
Copy link
Copy Markdown
Contributor Author

Hi @franciscojavierarceo — thanks for flagging the CI failures! I've investigated and fixed the root cause.

Root Cause

The two new test functions (test_apply_data_source_cross_project_isolation and test_delete_data_source_project_scoped) were parametrized with all_fixtures, which includes the session-scoped sqlite_registry fixture.

Since sqlite_registry is declared with @pytest.fixture(scope="session"), it shares a single SQLite database across the entire test session. Our new tests were writing "project_a" and "project_b" data into that shared database, which polluted it for subsequent tests. That's why test_apply_permission_success[sqlite_registry] and test_apply_project_success[sqlite_registry] were failing with assert 3 == 1 — they were seeing leftover data from our tests.

Fix Applied

Changed both new test functions to use [lazy_fixture("local_registry")] instead of all_fixtures. The local_registry fixture is function-scoped (fresh per test), which is correct here since the cross-project isolation behavior only affects file-based registries anyway.

CI Status

Both previously-failing checks now pass:

  • integration-test-registration-local
  • integration-test-registration-ci

DCO Note

One commit (bda17bc2e5) made via the GitHub web editor is missing a Signed-off-by line. A squash-merge on your end would resolve this cleanly. Apologies for the extra noise in the commit history.

@mailtoboggavarapu-coder
Copy link
Copy Markdown
Contributor Author

@franciscojavierarceo — one last blocker: the DCO check is action_required because commit bda17bc2e5 was made via the GitHub web editor and is missing a Signed-off-by line.

The cleanest fix is a squash merge — feast already defaults to "Squash and merge", which collapses all commits into one and bypasses the individual commit DCO issue entirely.

For context on the CI failures:

  • integration-test-registration-local and integration-test-registration-ci both pass (directly relevant to this fix)
  • unit-test-python (3.10, 3.12) and integration-test-python (3.11) are pre-existing flaky failures appearing across unrelated PRs on the same date

Would you be able to squash merge when you have a chance? Really appreciate the review! 🙏

@mailtoboggavarapu-coder
Copy link
Copy Markdown
Contributor Author

@franciscojavierarceoDCO is now fixed!

I rebased the branch to clean up the commit history — the unsigned commit has been replaced with a properly signed one. Here's the current state:

DCO — passing (Required check)
integration-test-registration-local — passing
integration-test-registration-ci — passing
mcp-feature-server-runtime — passing
integration-test-python (3.11) — failing, but this is the same pre-existing flaky infrastructure issue (exit code 2 / teardown timeout) that affects all PRs on this date

The PR is ready to merge whenever you are. Thanks again for the approval! 🙏

…ixes feast-dev#6206)

Filters existing data sources by both name and project in the file
registry implementation, fixing cross-project contamination (issue feast-dev#6206).

Changes:
- apply_data_source: added project-scoped deduplication check
- delete_data_source: added project filter to avoid cross-project deletion

Signed-off-by: Venkateswarlu Boggavarapu <[email protected]>
Regression tests for feast-dev#6206:
- test_apply_data_source_cross_project_isolation: verifies applying a data
  source to project_a does not overwrite the same-named source in project_b
- test_delete_data_source_project_scoped: verifies delete_data_source only
  removes the source from the specified project

Signed-off-by: Venkateswarlu Boggavarapu <[email protected]>
Changed parametrize decorators from all_fixtures to [lazy_fixture("local_registry")] for
both new test functions to avoid polluting the session-scoped sqlite_registry fixture.

Signed-off-by: Venkateswarlu Boggavarapu <[email protected]>
@ntkathole ntkathole merged commit 96562c4 into feast-dev:master May 1, 2026
25 of 26 checks passed
franciscojavierarceo pushed a commit that referenced this pull request May 4, 2026
# [0.63.0](v0.62.0...v0.63.0) (2026-05-04)

### Bug Fixes

* Add project filter to apply_data_source and delete_data_source (closes [#6206](#6206)) ([#6322](#6322)) ([96562c4](96562c4))
* Add project_id filter to SnowflakeRegistry UPDATE path ([#6243](#6243)) ([6658b71](6658b71)), closes [#6208](#6208) [#6208](#6208)
* Add subprocess timeouts to prevent test_e2e_local hanging on Dask atexit handler ([3de6556](3de6556))
* Ambiguous truth value of array during materialization ([#6259](#6259)) ([d0c8984](d0c8984))
* Auto-detect GCS/S3 registry store when registry is passed as string ([#6260](#6260)) ([7ebcf03](7ebcf03))
* **bigquery:** Prefer query over table in get_table_query_string ([#6360](#6360)) ([77ed779](77ed779)), closes [#6200](#6200)
* correct project_id scoping in get_user_metadata and delete_project ([0c469a7](0c469a7))
* disable Redis RDB persistence in test deployments ([44cd682](44cd682))
* Disable snowflake tests temporarily in CI ([#6356](#6356)) ([31d5a98](31d5a98))
* Filter empty SQL commands at execute_snowflake_statement call sites ([#6249](#6249)) ([92ffbb9](92ffbb9))
* Fix five bugs in milvus online store ([#6275](#6275)) ([212504b](212504b))
* Fix issue with apply feature view ([835cda8](835cda8))
* Fix streaming materialization for exotic sources with lazy UDF pipelines ([c07972d](c07972d))
* Handle missing features gracefully instead of panicking ([7d00b3a](7d00b3a))
* Harden informer cache with label selectors and memory optimizations ([#6242](#6242)) ([3f11356](3f11356))
* **helm:** Avoid nil pointer for metrics.enabled inside podAnnotations ([#6251](#6251)) ([c833f1a](c833f1a))
* Include git in feast server image ([fb03c46](fb03c46))
* Include StreamFeatureView in freshness metric ([#6269](#6269)) ([463f16c](463f16c))
* Pre-create S3A event log dir before SparkContext init ([#6317](#6317)) ([9feca77](9feca77))
* Remote Online Store Type Inference Error with All-NULL Columns ([#6063](#6063)) ([de67bdd](de67bdd))
* Remove selector with kustomize overlay using a JSON 6902 patch ([9107a43](9107a43))
* Resolve multiple bugs in SnowflakeRegistry and Snowflake connection handling ([#6315](#6315)) ([7e66a2e](7e66a2e))
* **spark:** BatchFeatureView with TransformationMode.PYTHON now reads all source columns ([a310eaf](a310eaf))
* **spark:** Use SELECT * when feature_name_columns is empty in pull_all_from_table_or_query ([e1b1d2d](e1b1d2d))
* Support pandas mode in feature builder and fix dask column extraction ([863315e](863315e))
* support SQL string as entity_df in RemoteOfflineStore.get_historical_features ([c559889](c559889))
* Wrap LocalOutputNode return value in ArrowTableValue for consist… ([#6286](#6286)) ([a16cd55](a16cd55))

### Features

* Add agent skills and Cursor/Claude rules for Feast development ([312eea3](312eea3))
* Add feature view versioning support to FAISS online store ([b36acb7](b36acb7))
* Add feature view versioning support to Redis and DynamoDB online stores ([#6257](#6257)) ([edf25af](edf25af)), closes [#6164](#6164) [#6163](#6163)
* Add optional 'org' in feature view ([#6288](#6288)) ([#6301](#6301)) ([608b105](608b105))
* Add RaySource, to_ray_dataset first-class method, docs, and tests ([1c98157](1c98157))
* Add TLS support for Go Feature Server ([#6229](#6229)) ([28a58d0](28a58d0))
* Add Vector Search support to MongoDBOnlineStore ([#6344](#6344)) ([c102738](c102738))
* Add versioning support to Milvus online store ([#6330](#6330)) ([3268ced](3268ced))
* Addresses performance issues in the Redis online store ([2e50da0](2e50da0))
* Allow to set gpu for ray ([5580ab4](5580ab4))
* Bump redis-py version cap from <5 to <8 ([#6339](#6339)) ([9538180](9538180))
* Expose feature_server, materialization, and openlineage configuration via FeatureStore CRD ([ec6ecfd](ec6ecfd))
* Make online_write_batch_size configurable in MaterializationConfig ([#6268](#6268)) ([d41becf](d41becf))
* Make udf optional if agg defined ([#5689](#5689)) ([#6328](#6328)) ([f630056](f630056))
* MongoDB offline store ([#6138](#6138)) ([8eebad7](8eebad7))
* Optional input_schema for ODFV ([#6308](#6308)) ([#6312](#6312)) ([f08b4e8](f08b4e8))
* Provision minimal TokenReview RBAC for OIDC auth and add SSL error logging in token parser ([#6240](#6240)) ([dca57e8](dca57e8))
* **spark:** Add compute-on-read support for BatchFeatureView in get_… ([#6357](#6357)) ([630d9f8](630d9f8))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

apply_data_source method matches by name only, without filtering by project in shared registry scenario

3 participants