Sindika - Building Tomorrow's Digital Infrastructure

The API is slow. Not “a little slow” — the kind of slow where your product manager opens DevTools, stares at a 4-second response, and walks over to your desk with that look.

You profile the endpoint. EF Core is generating SQL that looks like it was written by someone who lost a bet. A single query JOINs Orders to OrderLines to Products to Tags, producing 3,600 rows for what should be 20 orders. Pagination is broken. The database is doing more work than your entire CI pipeline.

You've just met the Cartesian explosion — and if you're using EF Core with complex domain models, it's only a matter of time before it finds you.

“We built a query engine that handles this automatically — detecting collection navigations, choosing the right split strategy, and building recursive filter expressions. This article shares the patterns we discovered, the mistakes we made, and the strategies that actually work in production.”

— Sindika Engineering Team

Whether you're building a SaaS platform, an enterprise ERP, or a data-heavy API, these patterns will transform how you think about EF Core queries. Let's start with the problem that everyone eventually hits.

Chapter 1: The Cartesian Explosion Problem

Consider a typical e-commerce domain: an Order has many OrderLines, each OrderLine has a Product, and each Product has many Tags. A perfectly normal domain model that maps cleanly to your database.

Now write the query everyone writes first:

var orders = await context.Orders
    .Include(o => o.OrderLines)
        .ThenInclude(ol => ol.Product)
            .ThenInclude(p => p.Tags)
    .OrderByDescending(o => o.OrderDate)
    .Take(20)
    .ToListAsync();

This looks innocent. EF Core translates it into a single SQL query with multiple JOINs. The problem? Row multiplication. If Order #1 has 3 OrderLines, and each Product has 4 Tags, that single order produces 3 × 4 = 12 rows in the result set. Multiply that by 20 orders, and you're looking at hundreds — sometimes thousands — of rows for a simple page load.

Each JOIN multiplies rows. What should be 20 records becomes thousands. Your LIMIT 20 now applies to joined rows, not orders.

But the row explosion isn't even the worst part. The LIMIT 20 in your query applies to the joined result set, not to the parent entities. So Take(20) might return only 3 or 4 actual orders instead of 20 — because the 20-row limit was consumed by the exploded rows of the first few orders.

Your pagination is silently broken, and nobody notices until a customer complains that page 2 shows the same items as page 1.

Chapter 2: EF Core's Built-in Answer

EF Core 5.0 introduced .AsSplitQuery() — a deceptively simple method that changes everything. Instead of one massive JOIN, EF Core issues separate SQL queries for each collection navigation:

var orders = await context.Orders
    .Include(o => o.OrderLines)
        .ThenInclude(ol => ol.Product)
            .ThenInclude(p => p.Tags)
    .AsSplitQuery()            // ← The magic line
    .OrderByDescending(o => o.OrderDate)
    .Take(20)
    .ToListAsync();

Split queries solve Cartesian explosion — but each split repeats the full base query with all its WHERE, ORDER BY, and LIMIT clauses.

This fixes pagination. LIMIT 20 now correctly applies to the parent Orders query, then child collections are loaded by matching parent IDs. No more row multiplication.

But — and there's always a “but” — there's a hidden cost.

🤔 The Hidden Tax of Auto Split

▸Repeated base queries — every split query repeats your entire WHERE clause, ORDER BY, and LIMIT/OFFSET. If your filters include complex EXISTS subqueries, they execute N times instead of once.
▸Database plan overhead — the query planner must parse and optimize the same complex query multiple times. With recursive collection filters, each split may contain deeply nested EXISTS clauses.
▸N+1... but intentionally — yes, split queries are technically N+1. The difference is that each query returns a focused result set. But with 5+ collection levels, that's 6+ round trips to the database.
▸Reflection dependency — if you're building a query engine library (like we did), calling .AsSplitQuery() requires a reference to Microsoft.EntityFrameworkCore.Relational. We solved this with reflection.

For most queries, auto split is good enough. But when your domain model has 3+ levels of nesting and your filters traverse collection boundaries with complex EXISTS subqueries, the repeated execution becomes measurable. We saw queries where the database spent 70% of its time re-evaluating the same filter across split requests.

We needed a better approach. One that runs the expensive filter exactly once.

Chapter 3: The ID-First Strategy

What if you could run the expensive filter once, grab just the IDs, and then load everything else with the simplest possible query?

That's exactly what the ID-First strategy does. It splits the work into two phases:

Phase 1: Execute a lightweight query that only selects primary keys. All your complex filters, sorting, and pagination happen here — but the projection is minimal. Just IDs. This query is fast.

Phase 2: Take those IDs and build a fresh query with WHERE Id IN (1, 2, 3, 5, 8...). This query loads the full entity graph with all includes, using the most efficient lookup possible — primary key matching. EF Core's split query mechanism then handles the collection navigations, but each split uses the simple IN clause instead of repeating the complex filter.

Phase 1 runs the filter once to fetch IDs. Phase 2 loads the full graph using fast primary key lookups.

// Phase 1: Lightweight ID projection
var ids = await query
    .Where(complexFilter)        // Expensive EXISTS here
    .OrderByDescending(o => o.OrderDate)
    .Skip(offset).Take(limit)
    .Select(o => (object)o.Id)   // Just the IDs
    .ToListAsync();

// Phase 2: Load by IDs with splits
var results = await query
    .Where(o => ids.Contains(o.Id))  // Simple IN clause
    .Include(o => o.OrderLines)
        .ThenInclude(ol => ol.Product)
    .AsSplitQuery()
    .ToListAsync();

✅ Why ID-First Wins for Complex Queries

✓Filter once, load many — complex EXISTS subqueries execute exactly once during the ID fetch. Child splits use O(1) primary key lookups.
✓Minimal Phase 1 cost — projecting just IDs means no data transfer overhead. 20 integers travel over the wire in microseconds.
✓Database-friendly — WHERE Id IN (...) is the most optimized query pattern in every database engine. Index seeks, not scans.
✓Composable — works with any filter complexity. The heavier your filters, the more you benefit from running them once.
✓Graceful fallback — if the entity doesn't have an Id property, fall back to auto split automatically. No runtime errors.

There's one subtlety worth noting: sort order preservation. TheWHERE Id IN (...) query doesn't guarantee result ordering. Since we already know the correct order from Phase 1 (those IDs came from a sorted, paginated query), we restore the order in memory after materialization — a trivial O(n) operation with a dictionary lookup.

Chapter 4: Filtering Through Collections

Here's where it gets really interesting. Standard EF Core filtering works great for scalar properties: Where(o => o.Status == "Active"). But what if you need to filter orders by a tag that lives three levels deep in a collection navigation?

You want to say: “Give me orders that have at least one order line whose product has a tag named ‘Premium’.”

In raw LINQ, that's a nested .Any() expression:

// What you'd write by hand:
x => x.OrderLines.Any(ol =>
    ol.Product.Tags.Any(t =>
        t.Name == "Premium"
    )
)

// What the engine does automatically from:
.AddFilter("OrderLines.Product.Tags.Name", Eq, "Premium")

The engine detects collection boundaries in the dot-notation path and automatically wraps each collection segment in an .Any() call. This translates to SQL EXISTS subqueries — exactly what the database optimizer wants to see.

The engine walks the navigation path, detects collection boundaries, and builds nested .Any() expressions that translate to efficient SQL EXISTS.

But there's one more twist. When you filter for “orders with Premium tags,” the SQL EXISTS only determines which parent records match. The included collections still return all child items — including non-Premium tags.

That's where FilterChildren comes in. It applies post-projection filtering on the materialized result, recursively walking nested collections and removing items that don't match. The result: what you filter is what you see.

// Filter the parent AND prune the children
.AddFilter(
    "OrderLines.Product.Tags.Name",
    FilterOperator.Eq,
    "Premium",
    filterChildren: true  // ← Prunes non-matching items
)

Chapter 5: The Complete Query Pipeline

Let's zoom out and see how all these patterns fit together. A single API request flows through a 7-step pipeline that handles everything from column parsing to post-projection filtering:

📋

Step 1

Parse Columns

Resolve dot-notation paths to navigation chains

🔗

Step 2

Auto Include

Detect collection navigations and apply .Include()

🔍

Step 3

Build Filters

Generate nested .Any() expressions for collection filters

↕️

Step 4

Apply Sort

Dynamic OrderBy from request parameters

⚡

Step 5

Split Decision

Choose None / Auto / ID-First based on query shape

🚀

Step 6

Execute

Materialize results with correct pagination

✂️

Step 7

Post-Project

Apply FilterChildren on nested collections

The beauty of this pipeline is that each step is independent and composable. Steps 1-4 build the query shape. Step 5 makes an intelligent decision about how to execute it. Step 7 refines the output. A developer using the API never needs to think about any of this — they just write dot-notation column names and filter expressions.

Strategy Comparison at a Glance

Here's how the three split strategies compare across the dimensions that actually matter in production:

Aspect	None (JOIN)	Auto Split	ID-First
SQL Queries	1 (large JOIN)	N separate	✓ 1 + N targeted
Filter Execution	Once	N times (repeated)	✓ Once (ID phase)
Pagination	❌ Broken	✓ ✓ Correct	✓ Correct
Network Trips	✓ 1	N	N + 1
Result Size	Cartesian (huge)	✓ Minimal	Minimal
Complexity	Minimal	✓ Automatic	Moderate
Best For	Simple 1:1	Most queries	✓ Heavy filters

Auto Split is the right default for most applications. ID-First is a precision tool — reach for it when profiling shows that repeated filter execution is your bottleneck. None is fine for simple one-to-one navigations where Cartesian explosion isn't a concern.

Chapter 6: Choosing Your Strategy

We've distilled the decision into two simple questions. No 50-page performance tuning guide. Just two honest checks against your query shape:

If your query doesn't touch any collections — just scalar navigations like Order.Customer.Name — a single JOIN is perfectly fine. No splitting needed.

If you do have collections but your filters are simple (no deep EXISTS nesting), EF Core's built-in auto split handles it beautifully.

But if you have complex collection-level filters — the kind that generate nested EXISTS subqueries — the ID-First strategy gives you measurably better database performance by running that filter exactly once.

“The best architectural decision is the one you can reverse cheaply. Start with Auto Split. Profile your queries. Switch to ID-First when the numbers tell you to — not when your intuition guesses.”

— Sindika Engineering Team

The Bottom Line

EF Core is more powerful than most teams realize. The difference between a 4-second API response and a 40-millisecond one often isn't a different ORM or a different database — it's understanding how your ORM generates SQL and giving it the right hints.

Split queries fix Cartesian explosion. ID-First eliminates redundant filter execution. Recursive collection filtering makes complex domains queryable through simple dot-notation. And all of it can be automated so your team just writes ?select=OrderLines.Product.Tags.Name and the engine handles the rest.

Let's Optimize Your Data Layer

Back to all articles

Advanced EF Core Patterns in .NET 9: Split Queries and Beyond

Chapter 1: The Cartesian Explosion Problem

Chapter 2: EF Core's Built-in Answer

🤔 The Hidden Tax of Auto Split

Chapter 3: The ID-First Strategy

✅ Why ID-First Wins for Complex Queries

Chapter 4: Filtering Through Collections

Chapter 5: The Complete Query Pipeline

Parse Columns

Auto Include

Build Filters

Apply Sort

Split Decision

Execute

Post-Project

Strategy Comparison at a Glance

Chapter 6: Choosing Your Strategy

The Bottom Line