What is Remove duplicates
Although the rows in a table should always be unique, when you select only a subset of the columns, the result rows may not be unique even if the original rows are. For example, you may have a table of suppliers with a requirement the city and state (or province) be unique so that there will never be more than one supplier in any city. However, if you just want to see the cities and countries/regions where suppliers are located, the returned results may not be unique. Suppose you write the following query:
SQLCopy
SELECT City, CountryRegion
FROM Production.Supplier
ORDER BY CountryRegion, City;
This query may return results similar to the following:
City
CountryRegion
Aurora
Canada
Barrie
Canada
Brampton
Canada
Brossard
Canada
Brossard
Canada
Burnaby
Canada
Burnaby
Canada
Burnaby
Canada
Calgary
Canada
Calgary
Canada
…
…
By default, the SELECT clause includes an implicit ALL keyword that results in this behavior:
SQLCopy
SELECT ALL City, CountryRegion
FROM Production.Supplier
ORDER BY CountryRegion, City;
T-SQL also supports an alternative the DISTINCT keyword, which removes any duplicate result rows:
SQLCopy
SELECT DISTINCT City, CountryRegion
FROM Production.Supplier
ORDER BY CountryRegion, City;
When using DISTINCT, the example returns only one of each unique combination of values in the SELECT list:
City
CountryRegion
Aurora
Canada
Barrie
Canada
Brampton
Canada
Brossard
Canada
Burnaby
Canada
Calgary
Canada
…
…