In a Spring Boot + JPA project, I encountered a bug where order line items were duplicated 2x or 3x in the API response. The DB data was perfectly fine — the duplication only appeared in the API layer. A tricky one to debug.
The root cause: two @OneToMany collections were JOIN FETCHed in a single JPQL query. This post covers why it happens, why DISTINCT doesn’t fix it, and the proper solution.
Symptoms: “The data is correct, but the UI is wrong”
After generating April orders in a SaaS subscription management system, something was off in the list view:

Every customer’s product names appeared twice, and amounts were exactly doubled. The detail view made it even clearer:

- Premium Plan $52.00 × 2
- Cloud Storage $25.00 × 2
- Total: $146.00 (inflated)
But here’s the interesting clue:

The payment link showed $69.00 — the correct amount. This value was stored directly in the DB.
Debugging
Step 1: Check the DB
Queried the database directly.
SELECT oli.id, oli.product_name, oli.price
FROM order_line_item oli
WHERE oli.order_id = 369;

Only 2 rows. No duplicates in the DB.
Step 2: Check the frontend
The React component was simply doing order.lineItems.map(). It renders whatever data it receives. Not the culprit.
Step 3: The API response is the problem
Duplication happens between DB and API response. Time to check the JPA query.
Root Cause: Cartesian Product from Two Collections
The problematic query:
@Query("SELECT o FROM Order o " +
"LEFT JOIN FETCH o.lineItems " + // @OneToMany List
"LEFT JOIN FETCH o.appliedDiscounts " + // @OneToMany Set
"JOIN FETCH o.customer c " +
"WHERE o.billingYear = :year AND o.billingMonth = :month " +
"ORDER BY c.name")
List<Order> findAllByMonth(...);
lineItems (List) and appliedDiscounts (Set) are JOIN FETCHed simultaneously.
The SQL Result
The generated SQL looks like:
SELECT o.*, oli.*, oad.*
FROM orders o
LEFT JOIN order_line_item oli ON oli.order_id = o.id
LEFT JOIN order_applied_discount oad ON oad.order_id = o.id
WHERE ...
With 2 lineItems and 2 appliedDiscounts for Order 369:

2 × 2 = 4 rows. This is a cartesian product.
How Hibernate Handles the Collections
When Hibernate maps these 4 rows back to entities:
Set(appliedDiscounts): Auto-deduplicates viaequals/hashCode→ 2 entries (correct)List(lineItems): Bag semantics — no deduplication → 4 entries (oli.id 385, 385, 386, 386)
List is an ordered collection. Hibernate adds each SQL row’s line item to the List as-is. When the same entity appears in multiple rows, the same object gets added multiple times.
DISTINCT Doesn’t Fix It
First attempt — add SELECT DISTINCT:
@Query("SELECT DISTINCT o FROM Order o " +
"LEFT JOIN FETCH o.lineItems " +
"LEFT JOIN FETCH o.appliedDiscounts " + ...)
No effect. Here’s why:
In Hibernate 6, JPQL DISTINCT deduplicates the root entity (Order) in the result list. It prevents the same Order object from appearing multiple times.
But it doesn’t touch the lineItems List inside each Order. Order 369 appears once in the result, but its lineItems collection already has 4 entries.
Solution: Separate the Collection Fetches
Core principle: Never JOIN FETCH multiple @OneToMany collections in a single query.
The Fix: @BatchSize
// Order.java
@OneToMany(mappedBy = "order", cascade = CascadeType.ALL, orphanRemoval = true)
@BatchSize(size = 50) // Loaded via separate IN query
private Set<OrderAppliedDiscount> appliedDiscounts = new LinkedHashSet<>();
// OrderRepository.java — removed appliedDiscounts JOIN FETCH
@Query("SELECT DISTINCT o FROM Order o " +
"LEFT JOIN FETCH o.lineItems " +
// LEFT JOIN FETCH o.appliedDiscounts — removed!
"JOIN FETCH o.customer c " +
"WHERE o.billingYear = :year AND o.billingMonth = :month ...")
List<Order> findAllByMonth(...);
Now the execution flow:
- Query 1: Order + lineItems JOIN FETCH → rows = number of lineItems (no duplication)
- Query 2 (automatic):
SELECT * FROM order_applied_discount WHERE order_id IN (?, ?, ..., ?)→ batches of 50
One query became two, but the cartesian product is eliminated and data is accurate.
Alternatives

Key Takeaways
JOIN FETCHing multiple @OneToMany collections in one query
→ SQL Cartesian Product → List(bag) duplication → inflated data
- DB is fine, API is wrong → Check JPA collection fetch strategy
DISTINCTonly deduplicates root entities, not List contentsSetauto-deduplicates,List(bag) does not- Fix: Fetch one collection per query +
@BatchSizefor the rest
Comments