3. A brief history of gorilla classification
Author & Year Scientific name
Savage
1847
Troglodytes gorilla
(Pan gorilla)
I. Geoffroy St. Hilaire
1952
Gorilla gorilla
Tuttle
1967
Pan gorilla
Groves
1967
Gorilla gorilla gorilla
homonym
synonym
split / merge
4. A matter of opinion
Taxonomy A:
Loxodonta africana
Taxonomy B:
Loxodonta africana
Loxodonta cyclotis
7. WITH endemic_taxon_concepts AS (
SELECT taxon_concept_id
FROM distributions
GROUP BY taxon_concept_id
HAVING COUNT(*) = 1
), countries_with_endemic_distributions AS (
SELECT d.geo_entity_id, COUNT(d.taxon_concept_id) AS cnt
FROM distributions d
INNER JOIN endemic_taxon_concepts q
ON d.taxon_concept_id = q.taxon_concept_id
GROUP BY d.geo_entity_id
)
SELECT geo_entities.name_en, cnt
FROM countries_with_endemic_distributions q
INNER JOIN geo_entities ON geo_entities.id = q.geo_entity_id
ORDER BY cnt DESC
8. name cnt
Indonesia 1353
Mexico 1069
Madagascar 970
Australia 886
Brazil 763
Ecuador 564
Papua New Guinea 561
South Africa 532
United States of America 520
9. Data-modifying CTE's
WITH deactivated_geo_entities AS (
UPDATE geo_entities SET is_active = FALSE
WHERE id IN (#{old_geo_entity_ids})
RETURNING id
)
UPDATE distributions
SET geo_entity_id = #{new_geo_entity_id}
FROM deactivated_geo_entities
WHERE distributions.geo_entity_id = deactivated_geo_entities.id
CTE = materialize by design
10. #2: Recursive CTE's
WITH RECURSIVE name [ (columns) ] AS (
non-recursive term
UNION [ALL]
recursive term
)
primary query
11. WITH RECURSIVE self_and_descendants (id, full_name) AS (
SELECT id, full_name FROM taxon_concepts
WHERE id = 472
UNION
SELECT hi.id, hi.full_name FROM taxon_concepts hi
JOIN self_and_descendants d ON d.id = hi.parent_id
)
SELECT COUNT(*) FROM self_and_descendants
count
432
12. WITH RECURSIVE self_and_ancestors (
parent_id, full_name, level
) AS (
SELECT parent_id, full_name, 1
FROM taxon_concepts WHERE id = 5563
UNION
SELECT hi.parent_id, hi.full_name, q.level + 1
FROM taxon_concepts hi
JOIN self_and_ancestors q ON hi.id = q.parent_id
)
SELECT full_name
FROM self_and_ancestors ORDER BY level DESC
13. WITH crocodile_ancestry AS (
WITH RECURSIVE self_and_ancestors (
-- [AS IN PREVIOUS SLIDE]
)
)
SELECT ARRAY_TO_STRING(ARRAY_AGG(full_name), ' > ')
AS breadcrumb FROM crocodile_ancestry
breadcrumb
Animalia > Chordata > Reptilia > Crocodylia >
Crocodylidae > Crocodylus > Crocodylus niloticus
16. WITH RECURSIVE cascading_refs(taxon_concept_id, exclusions) AS (
SELECT h.id, h_refs.excluded_taxon_concepts_ids
FROM taxon_concepts h
LEFT JOIN taxon_concept_references h_refs ON h_refs.taxon_concept_id = h.id
WHERE h.id = 10 AND h_refs.reference_id = 369
UNION
SELECT hi.id, cascading_refs.exclusions
FROM taxon_concepts hi
JOIN cascading_refs ON cascading_refs.taxon_concept_id = hi.parent_id
WHERE NOT COALESCE(cascading_refs.exclusions, ARRAY[]::INT[]) @>
ARRAY[hi.id]
)
UPDATE taxon_concepts SET has_std_ref = TRUE
FROM cascading_refs
WHERE cascading_refs.taxon_concept_id = taxon_concepts.id
17. #3: Window functions
SELECT ROW_NUMBER() OVER(ORDER BY full_name), full_name
FROM taxon_concepts
WHERE parent_id = 335 ORDER BY full_name
row_number full_name
1 Canis
2 Cerdocyon
3 Chrysocyon
4 Cuon
5 Dusicyon
18. WITH RECURSIVE q(id, full_name, path) AS (
SELECT id, full_name, ARRAY[1] FROM taxon_concepts h
WHERE id = 335
UNION
SELECT hi.id, hi.full_name,
q.path || (
ROW_NUMBER() OVER(
PARTITION BY parent_id ORDER BY hi.full_name
)
)::INT
FROM taxon_concepts hi
JOIN q ON hi.parent_id = q.id
)
SELECT path, full_name FROM q
ORDER BY path
CTE + window function
21. SQL Antipatterns: Avoiding the Pitfalls of Database
Programming Bill Karwin
PostgreSQL: Up and Running Regina Obe, Leo Hsu
High Performance SQL with PostgreSQL 8.4
https://github.com/unepwcmc/SAPI
Checklist of CITES Species
Biodiversity Information Standards (TDWG)
Items freed into the public domain Pearson Scott
Foresman
PostgreSQL
Code & Demo
Graphics
Taxonomy