Java and databases use SQL joins

Table of Contents

Introduction

Let's imagine we have two Tables now, people and addresses. We want to find out how many people live at the same address. This is our current Java and SQL approach:

repository/PeopleRepository.java

    @Override
    @SQL(value = FIND_BY_ID_SQL, operationType = CrudOperation.FIND_BY_ID)
    @SQL(value = FIND_ALL_SQL, operationType = CrudOperation.FIND_ALL)
    @SQL(value = SELECT_COUNT_SQL, operationType =  CrudOperation.COUNT)
    @SQL(value = DELETE_IN_SQL, operationType = CrudOperation.DELETE_MANY)
    @SQL(value = DELETE_SQL, operationType = CrudOperation.DELETE_ONE)
    Person extractEntityFromResultSet(ResultSet rs) throws SQLException {
        long personID = rs.getLong("ID");
        String firstName = rs.getString("FIRST_NAME");
        String lastName = rs.getString("LAST_NAME");
        ZonedDateTime dob = ZonedDateTime.of(rs.getTimestamp("DOB").toLocalDateTime(), ZoneId.of("+0"));
        BigDecimal salary = rs.getBigDecimal("SALARY");
        long homeAddressId = rs.getLong("HOME_ADDRESS");
        Optional<Address> homeAddress = addressRepository.findById(homeAddressId);
        Person person = new Person(personID, firstName, lastName, dob, salary);
        person.setHomeAddress(homeAddress.orElse(null));
        return person;
    }

repository/AddressRepository.java

    @Override
    @SQL(operationType = CrudOperation.FIND_BY_ID, value= """
            SELECT ID, STREET_ADDRESS, ADDRESS2, CITY, STATE, POSTCODE, COUNTY, REGION, COUNTRY
            FROM ADDRESSES
            WHERE ID = ?
            """)
    Address extractEntityFromResultSet(ResultSet rs) throws SQLException {
        long id = rs.getLong("ID");
        String streetAddress = rs.getString("STREET_ADDRESS");
        String address2 = rs.getString("ADDRESS2");
        String city = rs.getString("CITY");
        String state = rs.getString("STATE");
        String postcode = rs.getString("POSTCODE");
        String county = rs.getString("COUNTY");
        Region region = Region.valueOf(rs.getString("REGION").toUpperCase());
        String country = rs.getString("COUNTRY");
        Address address = new Address(id, streetAddress, address2, city, state, postcode, county, country, region);
        return address;
    }

The problem with this current approve, is that we would need two SQL queries for each person record, one to get the Person and one to get the address. Since Java methods to retrieve data from an SQL DB are declared as expensive operation in terms of time and performance efficiency this is not a good approach. So how could we optimize this?

One possible answer to this problem are making use of SQL JOIN.

What is a SQL JOIN

A JOIN combines records from two tables. JOIN matches related column values in two tables. A query can contain zero, one, or multiple JOIN operations. You can already imagine that this allows us to retrieve all our wanted data by just one SQL JOIN query.

How does it work

Let's look at some sample data:

SELECT * FROM PEOPLE WHERE HOME_ADDRESS = 2;

output: 

     ID  FIRST_NAME  LAST_NAME           DOB           SALARY  DELETE_FLAG  EMAIL   HOME_ADDRESS
-------  ----------  ---------  ---------------------  ------  -----------  ------  ------------
5001254  jooonyy     Smith      1982-09-13 00:51:54.0       0  <null>       <null>             2

as we can see this record has the address foreign key  2 and if we query for this record, we get this output:

SELECT * FROM ADDRESSES WHERE ID = 2;

output: 

ID  STREET_ADDRESS  ADDRESS2  CITY       STATE  POSTCODE  COUNTY         REGION  COUNTRY      
--  --------------  --------  ---------  -----  --------  -------------  ------  -------------
 2  123 Bale St.    Apt. 1A   Wala Wala  WA     90210     Fulton County  WEST    United States

Now, when we are joining multiple tables together, it is very common to make use of something called table aliases. And that's where we can simply give our tables and additional name or alias that we can use to refer to those individual tables columns.

The reason for this is because when you're joining two or more tables together, some of the columns of those tables may have the same names. So, for example, our people table has an ID column, so does our addresses table. So if you wanted to be able to retrieve various information from either of those particular columns, how could you specify which ID column you're talking about? The people one or the addresses one?Well, you can prefix the ID column of either of these tables with an alias.

So let's see how that would look like:

SELECT * FROM PEOPLE AS PEOPLE JOIN ADDRESSES AS ADDRESS ON PEOPLE.HOME_ADDRESS = ADDRESS.ID;

output:

     ID  FIRST_NAME  LAST_NAME           DOB           SALARY  DELETE_FLAG  EMAIL   HOME_ADDRESS  ID  STREET_ADDRESS  ADDRESS2  CITY       STATE  POSTCODE  COUNTY         REGION  COUNTRY      
-------  ----------  ---------  ---------------------  ------  -----------  ------  ------------  --  --------------  --------  ---------  -----  --------  -------------  ------  -------------
5001254  jooonyy     Smith      1982-09-13 00:51:54.0       0  <null>       <null>             2   2  123 Bale St.    Apt. 1A   Wala Wala  WA     90210     Fulton County  WEST    United States

It's one result set row that's combining all of the columns of the people table with all the columns of the address table for this one person. You may be wondering why did we only get one row because we didn't say anything like where home address equals four or something like that. So why did we only get one row? Well, the reason we only got one row in this particular case is because of the type of joint that is executing here. And this particular, joined by default, is what's known as an inner join. All an inner join is is that there has to be a match that satisfies this expression here on both tables, in order for a result set to be returned. Since there's only one record in the people table that has a foreign key that points to an address, and there's only one address that is being used by the people table, we only got back one result set in this particular case.

However, what if we wanted to be able to retrieve any or all people, whether or not they have a home address specified? And in fact, for our purposes in our code, that's exactly what we're going to need because when we find a person by ID, they might have an address and they might not. We don't want them to not show up just because they did not have an address. So to make that work, that's where we have to perform the left outer join, so that we can make sure that regardless of whether or not there's an address being pointed to, we always get back all of the people that we're looking for. So to do that, all we have to do is specify that we want a left outer join.

SELECT * FROM PEOPLE AS PEOPLE LEFT OUTER JOIN ADDRESSES AS ADDRESS ON PEOPLE.HOME_ADDRESS = ADDRESS.ID FETCH FIRST 10 ROWS ONLY;

output:

ID  FIRST_NAME  LAST_NAME           DOB           SALARY  DELETE_FLAG  EMAIL   HOME_ADDRESS      ID  STREET_ADDRESS  ADDRESS2  CITY    STATE   POSTCODE  COUNTY  REGION  COUNTRY
--  ----------  ---------  ---------------------  ------  -----------  ------  ------------  ------  --------------  --------  ------  ------  --------  ------  ------  -------
12  Harry       Jonson     1950-03-15 10:45:10.0  100000  false        <null>        <null>  <null>  <null>          <null>    <null>  <null>  <null>    <null>  <null>  <null> 
14  Jack        Jonson     2000-01-10 08:45:10.0   50000  false        <null>        <null>  <null>  <null>          <null>    <null>  <null>  <null>    <null>  <null>  <null> 
15  Mary        Jonson     2005-05-13 17:30:10.0   20000  false        <null>        <null>  <null>  <null>          <null>    <null>  <null>  <null>    <null>  <null>  <null> 
16  Sabrina     Jonson     1951-10-31 19:13:43.0  200000  false        <null>        <null>  <null>  <null>          <null>    <null>  <null>  <null>    <null>  <null>  <null> 
18  Jake        Smith      1970-10-31 19:13:43.0   75000  true         <null>        <null>  <null>  <null>          <null>    <null>  <null>  <null>    <null>  <null>  <null> 
19  John        Smith      1980-11-15 21:15:00.0  <null>  <null>       <null>        <null>  <null>  <null>          <null>    <null>  <null>  <null>    <null>  <null>  <null> 
20  John        Smith      1980-11-15 21:15:00.0  <null>  <null>       <null>        <null>  <null>  <null>          <null>    <null>  <null>  <null>    <null>  <null>  <null> 
21  Bobby       Smith      1982-09-13 00:51:54.0  <null>  <null>       <null>        <null>  <null>  <null>          <null>    <null>  <null>  <null>    <null>  <null>  <null> 
22  John        Smith      1980-11-15 21:15:00.0  <null>  <null>       <null>        <null>  <null>  <null>          <null>    <null>  <null>  <null>    <null>  <null>  <null> 
23  John        Smith      1980-11-15 21:15:00.0  <null>  <null>       <null>        <null>  <null>  <null>          <null>    <null>  <null>  <null>    <null>  <null>  <null> 

But for our situation we need something like this:

SELECT P.FIRST_NAME,P.LAST_NAME,A.STREET_ADDRESS,A.CITY FROM PEOPLE AS P LEFT OUTER JOIN ADDRESSES AS A ON P.HOME_ADDRESS = A.ID WHERE P.ID = 5001254;

output:

FIRST_NAME  LAST_NAME  STREET_ADDRESS  CITY     
----------  ---------  --------------  ---------
jooonyy     Smith      123 Bale St.    Wala Wala

Lets have a look how this will finally look like in our Java code:

repository/PeopleRepository.java

    public static final String FIND_BY_ID_SQL = """
        SELECT 
        P.ID, P.FIRST_NAME, P.LAST_NAME, P.DOB, P.SALARY, P.HOME_ADDRESS,
        HOME.ID as HOME_ID, HOME.STREET_ADDRESS as HOME_STREET_ADDRESS, HOME.ADDRESS2 as HOME_ADDRESS2, HOME.CITY as HOME_CITY, HOME.STATE as HOME_STATE, HOME.POSTCODE as HOME_POSTCODE, HOME.COUNTY as HOME_COUNTY, HOME.REGION as HOME_REGION, HOME.COUNTRY as HOME_COUNTRY,
        BIZ.ID as BIZ_ID, BIZ.STREET_ADDRESS as BIZ_STREET_ADDRESS, BIZ.ADDRESS2 as BIZ_ADDRESS2, BIZ.CITY as BIZ_CITY, BIZ.STATE as BIZ_STATE, BIZ.POSTCODE as BIZ_POSTCODE, BIZ.COUNTY as BIZ_COUNTY, BIZ.REGION as BIZ_REGION, BIZ.COUNTRY as BIZ_COUNTRY,
        FROM PEOPLE AS P
        LEFT OUTER JOIN ADDRESSES AS HOME ON P.HOME_ADDRESS = HOME.ID
        LEFT OUTER JOIN ADDRESSES AS BIZ ON P.BIZ_ADDRESS = BIZ.ID
        WHERE P.ID=?""";
    @Override
    @SQL(value = FIND_BY_ID_SQL, operationType = CrudOperation.FIND_BY_ID)
    @SQL(value = FIND_ALL_SQL, operationType = CrudOperation.FIND_ALL)
    @SQL(value = SELECT_COUNT_SQL, operationType =  CrudOperation.COUNT)
    @SQL(value = DELETE_IN_SQL, operationType = CrudOperation.DELETE_MANY)
    @SQL(value = DELETE_SQL, operationType = CrudOperation.DELETE_ONE)
    Person extractEntityFromResultSet(ResultSet rs) throws SQLException {
        long personID = rs.getLong("ID");
        String firstName = rs.getString("FIRST_NAME");
        String lastName = rs.getString("LAST_NAME");
        ZonedDateTime dob = ZonedDateTime.of(rs.getTimestamp("DOB").toLocalDateTime(), ZoneId.of("+0"));
        BigDecimal salary = rs.getBigDecimal("SALARY");
//        long homeAddressId = rs.getLong("HOME_ADDRESS");

        Address homeAddress = extractAddress(rs, "HOME_");
        Address bizAddress = extractAddress(rs, "BIZ_");

        Person person = new Person(personID, firstName, lastName, dob, salary);
        person.setHomeAddress(homeAddress);
        person.setBusinessAddress(bizAddress);
        return person;
    }

    private Address extractAddress(ResultSet rs, String aliasPrefix) throws SQLException {
        Long addrId = getValueByAlias(aliasPrefix + "ID", rs, Long.class);
        if (addrId == null) return null;
        String streetAddress = getValueByAlias(aliasPrefix + "STREET_ADDRESS", rs, String.class);
        String address2 = getValueByAlias(aliasPrefix + "ADDRESS2", rs, String.class);
        String city = getValueByAlias(aliasPrefix + "CITY", rs, String.class);
        String state = getValueByAlias(aliasPrefix + "STATE", rs, String.class);
        String postcode = getValueByAlias(aliasPrefix + "POSTCODE", rs, String.class);
        String county = getValueByAlias(aliasPrefix + "COUNTY", rs, String.class);
        Region region = Region.valueOf(getValueByAlias(aliasPrefix + "REGION", rs, String.class).toUpperCase());
        String country = getValueByAlias(aliasPrefix + "COUNTRY", rs, String.class);
        Address address = new Address(addrId, streetAddress, address2, city, state, postcode, country, county, region);
        return address;
    }

    private <T> T getValueByAlias(String alias, ResultSet rs, Class<T> clazz) throws SQLException {
        int columnCount = rs.getMetaData().getColumnCount();
        for (int colIdX = 1; colIdX < columnCount; colIdX++) {
            if (alias.equals(rs.getMetaData().getColumnLabel(colIdX))) {
                return (T) rs.getObject(colIdX);
            }
        }
        throw new SQLException(String.format("Column not found for alias: '%s'", alias));
    }