Can GROUP BY be omitted inside a correlated subquery?

Question

Hello. I ran into a question while working through the course exercises. In the query that retrieves the most recent salary record with a correlated subquery, the GROUP BY inside the subquery appears to be omitted:

    select * from hr.emp_salary_hist a where todate = (select max(todate) from hr.emp_salary_hist x where a.empno = x.empno);

Is it the condition where a.empno = x.empno inside the subquery that makes it return max(todate) per employee even without a GROUP BY? If so, can the GROUP BY also be omitted in the query below?

    -- Customers who placed two or more orders
    select * from nw.customers a where exists (select 1 from nw.orders x where x.customer_id = a.customer_id group by customer_id having count(*) >= 2);

권 철민 · Answer

Hello.

    select * from hr.emp_salary_hist a where todate = (select max(todate) from hr.emp_salary_hist x where a.empno = x.empno);

"Is it the condition where a.empno = x.empno inside the subquery that makes it return max(todate) per employee even without a GROUP BY?"

=> Yes, that is correct. It is not just that the GROUP BY can be left out: leaving it out actually makes the SQL clearer. An aggregate such as max() does not require a GROUP BY; aggregate functions can also be applied to the whole row set. Without max(todate), the subquery (select max(todate) from hr.emp_salary_hist x where a.empno = x.empno) would return several rows for the condition where a.empno = x.empno, but max(todate) reduces them to a single value, the most recent todate. There is no need to add group by x.empno.

The query below is different, however.

    select * from nw.customers a where exists (select 1 from nw.orders x where x.customer_id = a.customer_id group by customer_id having count(*) >= 2);

Here the subquery explicitly groups by customer_id and keeps only the customer_id values whose count(*) is 2 or more. The condition where x.customer_id = a.customer_id on its own can match a customer_id with fewer than 2 rows, so the filtering step is required. For that reason, the GROUP BY should not be removed from this SQL.

Thank you.
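The two cases in the answer can be checked on a tiny data set. The following is a minimal sketch using Python's built-in sqlite3 module rather than the course database; the rows, the salary column, and the order_id column are made up for illustration, and the hr and nw schemas are simulated by attaching in-memory databases. It runs the salary query with and without group by x.empno, then the customer query as written, and finally a hypothetical variant that drops both group by and having to show why the filter matters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: simulate the hr and nw schemas with attached in-memory databases.
conn.execute("ATTACH DATABASE ':memory:' AS hr")
conn.execute("ATTACH DATABASE ':memory:' AS nw")

# Hypothetical sample data; the real course tables have more columns and rows.
conn.executescript("""
CREATE TABLE hr.emp_salary_hist (empno INTEGER, todate TEXT, salary INTEGER);
INSERT INTO hr.emp_salary_hist VALUES
  (1, '2020-12-31', 100), (1, '9999-12-31', 120),
  (2, '2019-06-30', 200), (2, '2021-06-30', 210);
CREATE TABLE nw.customers (customer_id TEXT);
INSERT INTO nw.customers VALUES ('A'), ('B'), ('C');
CREATE TABLE nw.orders (order_id INTEGER, customer_id TEXT);
INSERT INTO nw.orders VALUES (1, 'A'), (2, 'A'), (3, 'B');
""")

def rows(sql):
    """Run a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()

# Case 1: the correlation condition already limits the subquery to one employee,
# so max(todate) returns a single value with or without GROUP BY.
latest_no_group = rows("""
    select empno, todate, salary from hr.emp_salary_hist a
    where todate = (select max(todate) from hr.emp_salary_hist x
                    where a.empno = x.empno)
    order by empno""")
latest_with_group = rows("""
    select empno, todate, salary from hr.emp_salary_hist a
    where todate = (select max(todate) from hr.emp_salary_hist x
                    where a.empno = x.empno group by x.empno)
    order by empno""")

# Case 2: the query from the question; HAVING removes groups with fewer than 2 orders.
two_plus = rows("""
    select customer_id from nw.customers a
    where exists (select 1 from nw.orders x
                  where x.customer_id = a.customer_id
                  group by x.customer_id having count(*) >= 2)
    order by customer_id""")

# Hypothetical variant: with GROUP BY and HAVING both dropped, the aggregate
# subquery always returns exactly one row, so EXISTS is true for every customer.
no_filter = rows("""
    select customer_id from nw.customers a
    where exists (select count(*) from nw.orders x
                  where x.customer_id = a.customer_id)
    order by customer_id""")

print(latest_no_group == latest_with_group, two_plus, no_filter)
```

On this sample data, both forms of the salary query return the same rows. Only customer A passes the group by ... having filter, while the variant without the filter returns every customer, including C, who has no orders.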