Skip to main content

Optimize Unnecessary Joins

Usage

Basic Concepts

In Jimmer SQL DSL, created join objects that are not used will be ignored. For example:

BookTable table = Tables.BOOK_TABLE;

System.out.println("Unused join: " + table.store());

List<Book> books = sqlClient
.createQuery(table)
.orderBy(table.name().asc(), table.edition().desc())
.select(table)
.execute();

Although a JOIN object is created via table.store, since it is not used by the SQL DSL of the current query, this JOIN object will be ignored.

Therefore, the final generated SQL will not contain any JOIN operations:

select
tb_1_.ID,
tb_1_.NAME,
tb_1_.EDITION,
tb_1_.PRICE,
tb_1_.STORE_ID
from BOOK as tb_1_
where tb_1_.NAME = ?

Another Solution

In

, we listed two scenarios.

In the previous doc, we discussed solving these two scenarios by utilizing Jimmer SQL DSL's automatic merge of conflicting joins.

Here, we use another approach to solve the same problems.

  • Solve the problem in scenario 1:

    List<Book> findBooks(
    @Nullable String name,
    @Nullable String storeName,
    @Nullable String storeWebsite
    ) {
    BookTable table = Tables.BOOK_TABLE;

    // Create JOIN object unconditionally first,
    // will be auto ignored if not used later,
    // so no useless JOIN
    BookStoreTable store = table.store();

    return sqlClient
    .createQuery(table)
    .whereIf(
    name != null,
    () -> book.name().like(name)
    )
    .whereIf(
    storeName != null,
    () -> store.name().like(storeName)
    )
    .whereIf(
    storeWebsite != null,
    () -> store.website().like(storeWebsite)
    )
    .select(book)
    .execute();
    }
    • If only name is specified, without storeName and storeWebsite, then store is a created but unused join object, thus ignored. The final generated SQL contains no joins.

      select
      tb_1_.ID,
      tb_1_.NAME,
      tb_1_.EDITION,
      tb_1_.PRICE,
      tb_1_.STORE_ID
      from BOOK as tb_1_
      where tb_1_.NAME = ?
    • If storeName and storeWebsite are specified, store will be used, leading to join operations in the final generated SQL. This is obvious so no example here.

    • Solve the problem in scenario 2:

      List<Long> findDistinctIds(
      @Nullable Long aId,
      @Nullable Long bId,
      @Nullable Long cId,
      @Nullable Long dId,
      @Nullable Long eId
      ) {
      ATable table = Tables.A_TABLE;

      // Create JOIN objects unconditionally first,
      // Not used join objects will be auto ignored,
      // so no useless JOIN objects
      BTableEx b = table.asTableEx().bs();
      CTableEx c = b.cs();
      DTableEx d = c.ds();
      ETableEx e = d.es();

      return sqlClient
      .createQuery(table)
      .whereIf(
      aId != null,
      () -> table.id().like(aId)
      )
      .whereIf(
      bId != null,
      () -> b.id().like(bId)
      )
      .whereIf(
      cId != null,
      () -> c.id().like(cId)
      )
      .whereIf(
      dId != null,
      () -> d.id().like(dId)
      )
      .whereIf(
      eId != null,
      () -> e.id().like(eId)
      )
      .select(book.id())
      .distinct()
      .execute();
      }
      info
      • Given the basics above, no need to list the generated SQL for different parameter combinations here. Just understand that the final SQL will not contain useless joins.

      • When a JOIN object is used by the SQL DSL, it will be marked as "used" to prevent being ignored. This mark is transitive.

        For example: Marking d as "used" will naturally also mark c, b and table.

      • asTableEx here (and various TableEx types in Java) is a concept to be introduced in later docs Pagination Safety. Please ignore it for now.

Phantom Joins

Two ways to filter on associated ids

For foreign key based associations, there are two ways to filter on the associated id:

  1. Use the foreign key property directly

    BookTable book = Tables.BOOK_TABLE;

    List<Book> books = sqlClient
    .createQuery(book)
    .where(
    book
    .storeId()
    .eq(2L)
    )
    .select(book)
    .execute();
    tip

    For one-to-one/many-to-one associations
    (like Book.store in this example), even if the user does not declare an @IdView
    property for the Book entity, the storeId property can still be used in SQL DSL.

  2. Join to the associated object first, then access the id property

    BookTable book = Tables.BOOK_TABLE;

    List<Book> books = sqlClient
    .createQuery(book)
    .where(
    book
    .store()
    .id() // only access id
    .eq(2L)
    )
    .select(book)
    .execute();
  • These two approaches are not equivalent:

    1. where(table.storeId().eq(2L)): Simply filter based on the foreign key field of the current table.

    2. where(table.store().id().eq(2L)): Actually join to the associated table first, then check the associated id value.

    They are not equivalent because:

    • The foreign key may be fake, i.e. there is no corresponding foreign key constraint in the database. This will cause non-null pseudo foreign keys to be unable to fetch non-null associated objects through joining.

    • The associated object may be affected by global filters, so it should be ignored even if it exists in the database.

  • However, the two approaches are completely equivalent when all of the following conditions are met:

    In this case, the id of the associated object is actually just the foreign key of the current table, so they are equivalent.

info

Developers should understand the difference between these two approaches and choose the right one according to the business needs.

However, when the two approaches are equivalent, optimizing the second approach into the first one is very beneficial - this is called phantom join elimination.

The equivalence of the two query methods is a precondition for phantom join elimination to take effect, so this will not be repeated below.

Inapplicable Scenarios

Phantom join elimination is only applicable to accessing the id property of associated objects, not other properties.

Here, accessing the name property of the associated object is used as an example to demonstrate the inability to eliminate the join.

BookTable book = Tables.BOOK_TABLE;

List<Book> books = sqlClient
.createQuery(book)
.where(book.store().name().eq("MANNING"))
.select(book)
.execute();

The generated SQL:

select
tb_1_.ID,
tb_1_.NAME,
tb_1_.EDITION,
tb_1_.PRICE,
tb_1_.STORE_ID
from BOOK as tb_1_
inner join BOOK_STORE as tb_2_
on tb_1_.STORE_ID = tb_2_.ID
where
tb_2_.NAME = ?

Applicable scenarios

If only the id property of the associated object is accessed, phantom join elimination can be triggered. For example:

BookTable book = Tables.BOOK_TABLE;

List<Book> books = sqlClient
.createQuery(book)
.where(
book
.store()
.id() // Access id only
.eq(2L)
)
.select(book)
.execute();

This time, the generated SQL:

select
tb_1_.ID,
tb_1_.NAME,
tb_1_.EDITION,
tb_1_.PRICE,
tb_1_.STORE_ID
from BOOK as tb_1_
where tb_1_.STORE_ID = ?

We see no table join in the SQL, only a condition tb_1_.STORE_ID = ? based on the foreign key.

The reason: For a many-to-one association mapped by foreign key, the parent table's id is actually the child table's own foreign key.

Half Joins

Half joins are a concept similar to phantom joins, but for join table based associations.

Two ways to filter on associated ids

For join table based associations, there are two ways to filter on the associated id:

  1. Check the foreign key field in the join table

    BookTable book = Tables.BOOK_TABLE;

    List<Book> books = sqlClient
    .createQuery(book)
    .where(
    book
    .asTableEx()
    .authorIds()
    .eq(2L)
    )
    .select(book)
    .execute();
    note

    The asTableEx here has no substantial function, it will be introduced in the next doc Pagination Safety. Please ignore it for now.

    caution

    For one-to-many/many-to-many associations (like Book.authors in this example), *users must declare an @IdView property *for the Book entity to access the authorIds property in SQL DSL.

  2. Join to the associated object first, then access the id property

    BookTable book = Tables.BOOK_TABLE;

    List<Book> books = sqlClient
    .createQuery(book)
    .where(
    book
    .asTableEx()
    .authors()
    .id() // only access id
    .eq(2L)
    )
    .select(book)
    .execute();
    note

    The asTableEx here has no substantial function, it will be introduced in the next doc Pagination Safety. Please ignore it for now.

  • These two approaches are not equivalent:

    1. where(table.authorIds().eq(2L)): 1 join operation

      From the current table BOOK, join to BOOK_AUTHOR_MAPPING table and directly check the BOOK_AUTHOR_MAPPING.AUTHOR_ID field.

    2. where(table.authors().id().eq(2L)): 2 join operations

      • Based on current table BOOK, join to BOOK_AUTHOR_MAPPING

      • Based on join table BOOK_AUTHOR_MAPPING, join to AUTHOR

      Finally, check the AUTHOR.ID field.

    They are not equivalent because:

    • The foreign key BOOK_AUTHOR_MAPPING.AUTHOR_ID may be fake, i.e. there is no corresponding foreign key constraint in the database. This will cause non-null pseudo foreign keys to be unable to fetch non-null associated objects through joining.

    • The associated object may be affected by global filters, so it should be ignored even if it exists in the database.

  • However, the two approaches are completely equivalent when all of the following conditions are met:

    In this case, the id of the associated object is actually just the foreign key of the current table, so they are equivalent.

info

Developers should understand the difference between these two approaches and choose the right one according to the business needs.

However, when the two approaches are equivalent, optimizing the second approach into the first one is very beneficial - this is called half join elimination.

The equivalence of the two query methods is a precondition for half join elimination to take effect, so this will not be repeated below.

Inapplicable scenarios

Half join elimination is only applicable to accessing the id property of associated objects, not other properties.

Here, accessing the Author.firstName property of the associated object is used as an example to demonstrate the inability to eliminate the second join.

BookTable book = Tables.BOOK_TABLE;

List<Long> bookIds = sqlClient
.createQuery(book)
.where(
book
.asTableEx()
.authors()
.firstName()
.eq("Alex")
)
.select(book.id())
.distinct()
.execute();
note

The asTableEx here has no substantial function, it will be introduced in the next doc Pagination Safety. Please ignore it for now.

The generated SQL:

select
distinct tb_1_.ID
from BOOK as tb_1_
inner join BOOK_AUTHOR_MAPPING as tb_2_
on tb_1_.ID = tb_2_.BOOK_ID
inner join AUTHOR as tb_3_
on tb_2_.AUTHOR_ID = tb_3_.ID
where tb_3_.FIRST_NAME = ?

We see join tables produce two SQL JOIN clauses:

  • First step: Join to intermediate table inner join BOOK_AUTHOR_MAPPING as tb_2_ on tb_1_.ID = tb_2_.BOOK_ID

  • Second step: Join to target table inner join AUTHOR as tb_3_ on tb_2_.AUTHOR_ID = tb_3_.ID

Applicable scenarios

If only the id property of the associated object is accessed, half join elimination can be triggered. For example:

BookTable book = Tables.BOOK_TABLE;

List<Long> bookIds = sqlClient
.createQuery(book)
.where(
book
.asTableEx()
.authors()
.id() // Access id only
.eq(2L)
)
.select(book.id())
.distinct()
.execute();
note

The asTableEx here has no substantial function, it will be introduced in the next doc Pagination Safety. Please ignore it for now.

This time, the generated SQL:

select
distinct tb_1_.ID
from BOOK as tb_1_
inner join BOOK_AUTHOR_MAPPING as tb_2_
on tb_1_.ID = tb_2_.BOOK_ID
where tb_2_.AUTHOR_ID = ?

This time we only see one SQL JOIN clause instead of two.

The reason: The target table's primary key is actually the foreign key from the intermediate table to the target table.