Information on dominant tree species is fundamental to forest management because species differ in ecological role, economic value, and cultural importance. Traditional field surveys supply these data but are costly and spatially limited. Airborne Laser Scanning (ALS) now enables modelling of forest attributes, including dominant species, across large areas with high structural detail.
This project develops and evaluates a Python-based workflow to predict dominant tree species at the stand level in the Petawawa Research Forest (PRF), Ontario, using ALS-derived structural metrics and a Random Forest classifier. A central research question is whether ALS data alone is sufficient for species classification, or whether fusing it with Landsat 8 multispectral imagery substantially improves accuracy.
Topographic metrics (slope and aspect) were derived from the PRF 2012 DTM using ArcPy's Spatial Analyst extension. Seven stand-level metrics were loaded as raster layers from the 2018 EFI surfaces: basal area, total aboveground biomass (summed across four size classes), dominant/codominant height, quadratic mean DBH, Lorey's height, stand density, and gross total volume. Landsat 8 Bands 2–7 (Blue through SWIR2) were included as multispectral predictors; Band 1 (coastal aerosol) was excluded.
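To make the topographic step concrete, the following is a minimal NumPy sketch of how slope and aspect can be computed from an elevation grid. It is not the ArcPy Spatial Analyst workflow used in the project: it swaps in simple central differences, so its values may differ somewhat from the ArcPy output. The function name `slope_aspect`, the toy surface, and the choice to mark flat cells with -1 are assumptions made for illustration, and the grid is assumed to be north-up with square cells.

```python
import numpy as np


def slope_aspect(dtm, cell_size):
    """Return slope (degrees) and aspect (compass degrees) for a north-up DTM.

    Illustrative only: uses central differences via np.gradient rather than
    the neighbourhood method of a GIS slope tool.
    """
    # np.gradient returns the derivative along rows (southward) first,
    # then along columns (eastward).
    dz_drow, dz_dcol = np.gradient(dtm.astype(float), cell_size)

    # Slope is the angle of the steepest gradient.
    slope_deg = np.degrees(np.arctan(np.hypot(dz_dcol, dz_drow)))

    # Aspect is the downslope direction, measured clockwise from north.
    # East component of downslope = -dz_dcol; north component = +dz_drow,
    # because rows increase toward the south.
    aspect_deg = np.degrees(np.arctan2(-dz_dcol, dz_drow)) % 360.0

    # Aspect is undefined on flat cells; flag them with -1 (an assumption).
    aspect_deg = np.where(slope_deg == 0, -1.0, aspect_deg)
    return slope_deg, aspect_deg


# Toy surface (not PRF data) rising 1 m per 1 m cell toward the east.
dtm = np.tile(np.arange(5, dtype=float), (5, 1))
slope, aspect = slope_aspect(dtm, cell_size=1.0)
```

On this toy surface every cell has a 45 degree slope and faces west (aspect 270), which is a quick sanity check on the sign conventions.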
All 15 feature layers (7 stand metrics + 6 spectral bands + slope + aspect) were stacked into a composite raster. Species represented by fewer than three ground plot observations were grouped into an "Other" class to avoid model instability. A RandomForestClassifier was trained and validated using a 70/30 train-validation split, then applied pixel by pixel to the full landscape composite to produce a wall-to-wall species map.
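The stacking, rare-class grouping, training, and wall-to-wall prediction steps can be sketched as follows. This is an illustration under stated assumptions, not the project's actual script: random arrays stand in for the 15 raster layers and the ground plots, the species names are placeholders, and the helper names (`group_rare_classes`, `classify_composite`), the number of trees, and the random seed are all hypothetical choices. Only the threshold of three observations and the 70/30 split come from the description above. NoData handling is omitted.

```python
from collections import Counter

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split


def group_rare_classes(labels, min_count=3, other_label="Other"):
    """Relabel species with fewer than min_count observations as other_label."""
    counts = Counter(labels)
    return np.array(
        [lab if counts[lab] >= min_count else other_label for lab in labels]
    )


def classify_composite(composite, plot_rows, plot_cols, plot_species, seed=0):
    """Train on plot pixels, then predict a species label for every pixel.

    composite has shape (n_layers, n_rows, n_cols).
    """
    n_layers, n_rows, n_cols = composite.shape

    # Sample the feature stack at each plot location: (n_plots, n_layers).
    X = composite[:, plot_rows, plot_cols].T
    y = group_rare_classes(plot_species)

    # 70/30 train-validation split, as described in the text.
    X_train, X_val, y_train, y_val = train_test_split(
        X, y, test_size=0.3, random_state=seed
    )

    # n_estimators is an assumed value, not taken from the project.
    model = RandomForestClassifier(n_estimators=200, random_state=seed)
    model.fit(X_train, y_train)
    val_accuracy = model.score(X_val, y_val)

    # Flatten to one row per pixel, predict, and restore the raster shape.
    flat = composite.reshape(n_layers, -1).T
    species_map = model.predict(flat).reshape(n_rows, n_cols)
    return species_map, val_accuracy


# Synthetic stand-ins for the 15 layers and the ground plots (not PRF data).
rng = np.random.default_rng(0)
layers = [rng.random((20, 30)) for _ in range(15)]
composite = np.stack(layers)  # shape (15, 20, 30)
plot_rows = rng.integers(0, 20, size=60)
plot_cols = rng.integers(0, 30, size=60)
plot_species = np.array(["species_A"] * 30 + ["species_B"] * 28 + ["species_C"] * 2)

species_map, val_accuracy = classify_composite(
    composite, plot_rows, plot_cols, plot_species
)
```

In this sketch the two `species_C` plots fall below the threshold and are relabelled "Other" before training. Because the inputs are random noise, the resulting accuracy carries no meaning; the point is only the shape of the workflow.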
The model trained on ALS structural metrics alone achieved low overall accuracy. Adding the Landsat 8 multispectral bands substantially improved per-class precision and recall, confirming that data fusion is necessary for reliable species classification in this study area.
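For readers unfamiliar with these measures, the sketch below shows how overall accuracy and per-class precision and recall follow from a confusion matrix. The labels are a made-up toy example, not results from this project, and the function name `accuracy_summary` is hypothetical.

```python
import numpy as np


def accuracy_summary(y_true, y_pred):
    """Return overall accuracy and {label: (precision, recall)}."""
    labels = sorted(set(y_true) | set(y_pred))
    index = {lab: i for i, lab in enumerate(labels)}

    # Confusion matrix: rows are true classes, columns are predicted classes.
    cm = np.zeros((len(labels), len(labels)), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[index[t], index[p]] += 1

    # Overall accuracy: correctly classified samples over all samples.
    overall = np.trace(cm) / cm.sum()

    predicted_totals = cm.sum(axis=0)  # column sums
    true_totals = cm.sum(axis=1)       # row sums
    per_class = {}
    for lab, i in index.items():
        # Precision: of the samples predicted as this class, how many were right.
        precision = cm[i, i] / predicted_totals[i] if predicted_totals[i] else float("nan")
        # Recall: of the samples truly in this class, how many were found.
        recall = cm[i, i] / true_totals[i] if true_totals[i] else float("nan")
        per_class[lab] = (precision, recall)
    return overall, per_class


# Toy labels for illustration only.
y_true = ["A", "A", "A", "B", "B", "Other"]
y_pred = ["A", "A", "B", "B", "B", "A"]
overall, per_class = accuracy_summary(y_true, y_pred)
```

Running the same summary on the validation split for an ALS-only model and for a fused ALS plus Landsat model is one way to make the comparison described above, class by class.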