Listagg distinct Examples | How to remove duplicates using Listagg distinct in SQL?

In my previous article I have given the details about how user can remove duplicate records in SQL? In this article I would like to give information about one of the important function in Oracle SQL which is Listagg with distinct keyword. The Listagg distinct function is one of the key important function in Oracle SQL which has introduced in oracle 11 G which used to aggregate the results set in multiple rows in to one single column. In this article I would like to explain the query to remove duplicates using Listagg in SQL.

Development of Listagg function in Oracle versions :

The Listagg function has been introduced in Oracle 11 G where it gives the aggregate result set in multiple rows in one single column. The Listagg function feature again enhanced and in Oracle 12 C adding ON OVERFLOW TRUNCATE clause to manage situation of concatenated string length.

Now In oracle 19c Listagg function further enhanced and added distinct clause to remove duplicate records from the list. The function will give you ability to remove the records in easy steps rather than using very complex query for removing duplicates. It exclusively used to remove duplicates from the list in easy way.

Delete duplicate Records
Remove duplicate records

The following table has the duplicate values :

Query :

   select d.deptno, dname,
   listagg(job, ',') within group (order by job) jobs_in_dept
   from dept d, emp e
   where e.deptno = d.deptno
   group by d.deptno, dname;


    DEPTNO DNAME          JOBS_IN_DEPT
---------- -------------- --------------------------------------------------
        10 IT             BI Engineer,Manager,VP
        20 R&D            IT Engineer, IT Engineer,QA,QA,MANAGER
        30 Marketing      Market Head,MANAGER,Market Head

The above table will give you the records in the list which is duplicate records. If someone asked you the question like if user wants to find out the Department name and Jobs in department in distinct then you require to write very complex query to achieve that. But using the new feature of listagg function you can eliminate the duplicate records from the list.

Query to remove duplicates using the Listagg distinct function :

     select d.deptno, dname,
     listagg(distinct job, ',') within group (order by job) jobs_in_dept
     from dept d, emp e
     where e.deptno = d.deptno
     group by d.deptno, dname;

The output of the same :


    DEPTNO DNAME          JOBS_IN_DEPT
---------- -------------- --------------------------------------------------
        10 IT             BI Engineer,Manager,VP
        20 R&D            IT Engineer,QA,MANAGER
        30 Marketing      Market Head,MANAGER

The Jobs_in_dept are distinct and the list is now distinct records.

More ways to remove duplicates :

Before the Listagg function with distinct keyword we require to use the complex queries to remove duplicates from the column list. In Lower versions of Oracle we can not remove duplicate values with using the listagg and distinct together as the feature is introduced in oracle 19 c. We can remove duplicates using the nested query in lower versions of oracle.

Query :

select deptno, dname,
listagg(job, ',') within group (order by job) jobs_in_dept
from
 (
    select distinct
    d.deptno, d.dname, e.job
    from dept d, emp e
    where e.deptno = d.deptno
  )
   group by deptno, dname;

Output :
   DEPTNO DNAME          JOBS_IN_DEPT
---------- -------------- --------------------------------------------------
        10 IT             BI Engineer,Manager,VP
        20 R&D            IT Engineer,QA,MANAGER
        30 Marketing      Market Head,MANAGER

We can also use the Regexp_replace function to remove duplicates.

Query :

with data as
  (
   select d.deptno, dname,
   listagg(job, ',') within group (order by job) jobs_in_dept
   from dept d, emp e
   where e.deptno = d.deptno
   group by d.deptno, dname
   )
   select deptno, dname,
   regexp_replace(jobs_in_dept,'([^,]+)(,\1)+', '\1') jobs_in_dept
  from data;

Output :
   DEPTNO DNAME          JOBS_IN_DEPT
---------- -------------- --------------------------------------------------
        10 IT             BI Engineer,Manager,VP
        20 R&D            IT Engineer,QA,MANAGER
        30 Marketing      Market Head,MANAGER

These are above few ways to remove the duplicates from the list. The easiest way to remove duplicates from the list is using listagg distinct. I hope you find this article useful. If you find the article useful or if you have any concerns with the same kindly comment in comments section.

Leave a Reply

Your email address will not be published.