Materializes all rows as Python proto objects: high memory usage at scale

**Is your feature request related to a problem? Please describe.**
When materializing features to an online store via `LocalOutputNode`, the current implementation converts the entire Arrow Table into a Python list of `ValueProto` objects before any writing occurs. At hundreds of thousands of rows, this causes severe memory pressure and can lead to out-of-memory (OOM) failures in practice.

**Root Cause**

The call chain in [`LocalOutputNode.execute()`](sdk/python/feast/infra/compute_engines/local/nodes.py#L369):

```python
rows_to_write = _convert_arrow_to_proto(
    input_table, self.feature_view, join_key_to_value_type
)
online_store.online_write_batch(..., data=rows_to_write, ...)
```

`_convert_arrow_to_proto` ([utils.py:325](sdk/python/feast/utils.py#L325)) performs three full-data copies sequentially:

1. **Arrow → NumPy** (`to_numpy(zero_copy_only=False)`): necessary to bridge Arrow nulls to the Python type system
2. **NumPy → `List[ValueProto]`**: each scalar becomes an independent Python protobuf heap object (~200 bytes overhead per value vs 4–8 bytes raw)
3. **Column-wise → row-wise** (`list(zip(...))`): full materialization into a Python list
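To make the per-value overhead in step 2 concrete, here is a rough, standard-library-only sketch comparing a packed 8-byte double with the same value boxed in a Python object. `BoxedValue` is a hypothetical stand-in, not feast's `ValueProto`; a real protobuf message carries more bookkeeping, so this should be read as a lower bound on the effect rather than a measurement of the actual code path.

```python
import struct
import sys
from array import array


class BoxedValue:
    """Hypothetical stand-in for a per-value wrapper; NOT feast's ValueProto."""

    __slots__ = ("double_val",)

    def __init__(self, value: float) -> None:
        self.double_val = value


def packed_bytes_per_value() -> int:
    # Bytes used per element in a contiguous buffer of C doubles.
    return array("d").itemsize


def boxed_bytes_per_value() -> int:
    # Wrapper object + the Python float it holds + the pointer slot
    # that a containing list needs to reference the wrapper.
    sample = BoxedValue(1.5)
    pointer_size = struct.calcsize("P")
    return sys.getsizeof(sample) + sys.getsizeof(sample.double_val) + pointer_size


if __name__ == "__main__":
    packed = packed_bytes_per_value()
    boxed = boxed_bytes_per_value()
    print(f"packed: {packed} B/value, boxed: {boxed} B/value, ratio: {boxed / packed:.1f}x")
```

The exact numbers depend on the interpreter version and platform, but even this minimal wrapper costs several times the raw value, and the cost is paid for every cell in the table at once when the whole table is converted up front.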


**Describe the solution you'd like**

**Chunk iteration in `LocalOutputNode` (minimal, low-risk):**

```python
BATCH_SIZE = 10_000
for batch in input_table.to_batches(max_chunksize=BATCH_SIZE):
    rows_to_write = _convert_arrow_to_proto(
        batch, self.feature_view, join_key_to_value_type
    )
    online_store.online_write_batch(
        config=context.repo_config,
        table=self.feature_view,
        data=rows_to_write,
        progress=lambda x: None,
    )
    # rows_to_write eligible for GC after each iteration
```
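The same pattern can be sketched independently of feast and Arrow to show why peak memory is bounded by the batch size. In the sketch below, `iter_chunks`, `write_in_chunks`, `convert`, and `write` are hypothetical names introduced for illustration; `convert` stands in for `_convert_arrow_to_proto` and `write` for `online_write_batch`.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def iter_chunks(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` items from `items`."""
    if size <= 0:
        raise ValueError("size must be positive")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def write_in_chunks(
    items: Iterable[T],
    convert: Callable[[T], U],
    write: Callable[[List[U]], None],
    size: int,
) -> int:
    """Convert and write `items` chunk by chunk; return the number written."""
    written = 0
    for chunk in iter_chunks(items, size):
        # Only one chunk of converted objects is alive at a time.
        converted = [convert(item) for item in chunk]
        write(converted)
        written += len(converted)
    return written


# Usage: 25 inputs with a batch size of 10 produce batches of 10, 10, and 5.
batch_sizes: List[int] = []
total = write_in_chunks(range(25), str, lambda batch: batch_sizes.append(len(batch)), 10)
# batch_sizes == [10, 10, 5], total == 25
```

One trade-off to keep in mind: with chunked writes, a failure partway through leaves earlier chunks already written, so callers that relied on a single all-or-nothing call may need to account for partial progress.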



Materializes all rows as Python proto objects — high memory usage at scale #6160
