What is 90% of melanoma cancer caused by?

Approximately 90% of melanoma cases are attributed to ultraviolet (UV) radiation exposure, primarily from sunlight and artificial tanning devices. UV radiation damages the DNA of melanocytes, the pigment-producing skin cells, and repeated damage over time leads to mutations that can trigger uncontrolled cell growth. The American Academy of Dermatology identifies UV exposure as the most significant modifiable risk factor for melanoma. Genetic predisposition, fair skin type, a history of sunburn, and a large number of atypical moles also increase risk, but UV exposure remains the dominant environmental cause across the published epidemiological literature. Avoiding tanning beds and using broad-spectrum SPF 30 or higher sunscreen are the two highest-impact prevention steps.

What does the beginning of skin cancer look like?

Early skin cancer often appears as a new or changing spot that does not look like surrounding moles. The ABCDE criteria cover Asymmetry, Border irregularity, Colour variation, Diameter over 6 mm, and Evolution. A lesion that is asymmetric, has a ragged or notched border, shows multiple shades of brown, tan, black, red, or blue, measures more than 6 mm across, or has changed recently warrants prompt dermatologist review. Basal cell carcinoma often starts as a pearly or waxy bump. Squamous cell carcinoma can begin as a firm red nodule or flat scaly crust. If in doubt, a dermatologist can assess the lesion with a dermatoscope and, where available, an AI-assisted triage tool to prioritise urgent cases.

Skin Lesion Classification CNN: Architecture & Training...

As of June 10, 2025.

AI-assisted dermatology tools help patients get faster, more consistent access to specialist review. A core component of these tools is a skin lesion classifier: a CNN (Convolutional Neural Network) that takes a dermoscopy image as input and outputs a probability distribution across lesion categories. This article explains how such classifiers are typically built, covering data, architecture, training decisions, and validation — drawing on published research.

What is the HAM10000 dataset?

HAM10000 (Human Against Machine with 10,000 training images) is a publicly available benchmark of 10,015 dermoscopy images spanning seven skin lesion categories, assembled by Tschandl et al. and published in Scientific Data (a Nature Portfolio journal) in 2018. It remains one of the most widely used training corpora for skin lesion classification research as of mid-2025.

The seven classes are: melanocytic nevi (nv), melanoma (mel), benign keratosis-like lesions (bkl), basal cell carcinoma (bcc), actinic keratoses and intraepithelial carcinoma (akiec), vascular lesions (vasc), and dermatofibroma (df). Images were collected over roughly two decades at the Medical University of Vienna, Austria, and at the skin cancer practice of Cliff Rosendahl in Queensland, Australia, covering a range of acquisition devices. The class distribution is heavily skewed: approximately 67% of images are benign nevi, a well-known challenge for any team training a classifier on this data. The dataset is available through the ISIC (International Skin Imaging Collaboration) archive.

Why do researchers use a CNN instead of a standard neural network?

A CNN (Convolutional Neural Network) applies learned spatial filters that slide across an image, detecting texture, colour gradient, and structural features like asymmetric borders regardless of where they appear in the frame. A fully connected network would need to learn these spatial relationships separately for every pixel position, requiring far more parameters and training data to achieve comparable results.
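To put rough numbers on the parameter argument, the sketch below counts the weights in a single fully connected layer and a single convolutional layer applied to the same input. The sizes (a 224x224 RGB image, 64 output units or filters, a 3x3 kernel) are illustrative assumptions rather than figures from any particular paper.

```python
def dense_params(n_inputs: int, n_units: int) -> int:
    """Weights plus biases for one fully connected layer."""
    return n_inputs * n_units + n_units


def conv_params(kernel: int, in_channels: int, out_channels: int) -> int:
    """Weights plus biases for one 2D convolution layer with a square kernel."""
    return kernel * kernel * in_channels * out_channels + out_channels


# Assumed illustrative sizes: 224x224 RGB input, 64 units / 64 filters.
height, width, channels = 224, 224, 3
dense = dense_params(height * width * channels, 64)
conv = conv_params(3, channels, 64)

print(f"fully connected layer: {dense:,} parameters")  # 9,633,856
print(f"3x3 convolution layer: {conv:,} parameters")   # 1,792
print(f"ratio: about {dense // conv:,}x")
```

The convolutional layer reuses the same small filter at every position, which is where the saving comes from.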

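For dermoscopy images, this matters in practice. A suspicious pigment network pattern in the upper-left corner of a 450x600 image should produce the same feature activations as the same pattern centred in the frame. Convolution is translation-equivariant by design: shifting the pattern shifts the activations without changing their values, and the pooling layers that follow make the final prediction largely insensitive to position. Because convolutional weights do not depend on image size, the same trained weights can also be applied to images of different resolutions when the network ends in a global (adaptive) pooling layer, which is useful when input images come from different dermatoscope models.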
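The position argument can be checked on a toy example. The sketch below is a plain-Python stand-in for a convolution layer, not code from any dermatology model: it pastes the same 2x2 pattern at two positions in an otherwise empty 8x8 image and applies one filter. The function names and the pattern values are made up for illustration.

```python
def cross_correlate(image, kernel):
    """Valid 2D cross-correlation (the operation deep learning libraries call convolution)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def place_patch(size, patch, top, left):
    """Return a size x size zero image with `patch` pasted at (top, left)."""
    image = [[0] * size for _ in range(size)]
    for i, row in enumerate(patch):
        for j, value in enumerate(row):
            image[top + i][left + j] = value
    return image


def argmax_2d(grid):
    """(row, col) of the largest value in a 2D list."""
    positions = ((r, c) for r in range(len(grid)) for c in range(len(grid[0])))
    return max(positions, key=lambda rc: grid[rc[0]][rc[1]])


# Toy pattern and a filter tuned to it.
pattern = [[1, 2], [3, 4]]
kernel = [[1, 2], [3, 4]]

corner = cross_correlate(place_patch(8, pattern, 0, 0), kernel)
centre = cross_correlate(place_patch(8, pattern, 3, 3), kernel)

# Equivariance: the peak moves with the pattern.
print(argmax_2d(corner), argmax_2d(centre))

# Global max pooling discards the position, so both images give the same feature.
peak_corner = max(max(row) for row in corner)
peak_centre = max(max(row) for row in centre)
print(peak_corner == peak_centre)
```

The filter response has the same peak value in both images; only its location differs, and global pooling removes that difference.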

Published comparisons consistently show that CNN backbones outperform simple fully connected networks on HAM10000 and similar dermoscopy datasets by a substantial margin on weighted F1 before any hyperparameter tuning.

A deeper primer on how computer vision models process skin images is available in our article on how computer vision models detect skin conditions.

Which architectures are used in published skin lesion classification research?

Research teams have evaluated a range of CNN architectures pretrained on ImageNet and fine-tuned on HAM10000. Published results show EfficientNet variants and ResNet-based models among the best-performing options on the ISIC benchmarks, balancing accuracy with parameter efficiency.

Architectures commonly reported in the literature include:

Model	Year	Params (M)	Notes
ResNet-50	2015	25.6	Strong baseline, widely cited in dermatology ML research
Inception-V3	2016	23.8	Good at multi-scale features
EfficientNet-B3	2019	12.2	Favourable accuracy-to-size ratio in published results
ViT-B/16	2020	86.6	High accuracy on large datasets; parameter-heavy relative to HAM10000 scale

The Vision Transformer (ViT-B/16) typically requires more training data to generalise well. At 10,015 images, HAM10000 is modest by computer vision standards, which is why CNN architectures with compound scaling, such as EfficientNet, appear frequently in published skin lesion classification work. The ISIC annual challenge leaderboard provides a public record of model performance across these architectures.

For context on how architecture choices interact with Fitzpatrick skin tone generalisation, see our article on training AI on diverse skin tones.

How do researchers handle class imbalance during training?

Class imbalance is the central practical challenge in training a skin lesion classification CNN. When approximately 67% of training images belong to a single class (benign nevi), a naive model learns to predict that class most of the time and still achieves a passable accuracy figure, while failing badly on clinically critical minority classes such as melanoma.
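A quick calculation shows how misleading plain accuracy can be here. The per-class counts below are the ones reported in the HAM10000 dataset paper (they are not listed elsewhere in this article), and the "classifier" is a hypothetical one that always answers "nv".

```python
# Per-class image counts as reported for HAM10000 (Tschandl et al., 2018).
counts = {"nv": 6705, "mel": 1113, "bkl": 1099, "bcc": 514,
          "akiec": 327, "vasc": 142, "df": 115}
total = sum(counts.values())

# Hypothetical degenerate model: predicts "nv" for every image.
accuracy = counts["nv"] / total
recall = {label: (1.0 if label == "nv" else 0.0) for label in counts}
balanced_accuracy = sum(recall.values()) / len(recall)

print(f"images: {total}")
print(f"plain accuracy: {accuracy:.3f}")
print(f"melanoma recall: {recall['mel']:.3f}")
print(f"mean per-class recall: {balanced_accuracy:.3f}")
```

The model is right about two times in three while detecting no melanomas at all, which is why per-class metrics matter more than headline accuracy on this dataset.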

Two complementary techniques are widely reported in the literature: weighted cross-entropy loss (giving rare classes proportionally higher gradient signal during backpropagation) and augmentation-based oversampling for minority classes. Augmentation typically includes random flip, rotation, colour jitter, and random crop with resize to generate synthetic variants, bringing minority classes to a higher effective training count per epoch.
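As a minimal sketch of the first technique, the code below derives inverse-frequency class weights and applies them in a hand-written weighted cross-entropy. The weighting formula (total divided by number of classes times class count) is one common choice, not the only one, and the counts are those reported in the HAM10000 dataset paper. The normalisation by the sum of target weights mirrors the mean reduction of PyTorch's `torch.nn.CrossEntropyLoss` when a `weight` tensor is supplied; everything else here is illustrative.

```python
import math

# Per-class image counts as reported for HAM10000 (Tschandl et al., 2018).
counts = {"nv": 6705, "mel": 1113, "bkl": 1099, "bcc": 514,
          "akiec": 327, "vasc": 142, "df": 115}
labels = list(counts)
total = sum(counts.values())

# Inverse-frequency weights: rare classes get weights above 1, common ones below.
class_weight = {c: total / (len(counts) * n) for c, n in counts.items()}


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    norm = sum(exps)
    return [e / norm for e in exps]


def weighted_cross_entropy(batch_logits, batch_targets):
    """Weighted mean of -log p(target), normalised by the sum of target weights."""
    weighted_sum = 0.0
    weight_total = 0.0
    for logits, target in zip(batch_logits, batch_targets):
        prob = softmax(logits)[labels.index(target)]
        w = class_weight[target]
        weighted_sum += -w * math.log(prob)
        weight_total += w
    return weighted_sum / weight_total


for label in ("nv", "mel", "df"):
    print(f"{label}: weight {class_weight[label]:.2f}")
```

With these weights, an error on a dermatofibroma or melanoma image contributes far more to the loss than the same error on a nevus.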

Published results consistently show that these adjustments improve both overall weighted F1 and per-class recall for melanoma — reducing false-negative rates meaningfully compared to training on raw class proportions. In a clinical context, missing a melanoma is far more consequential than a false-positive that prompts a follow-up, which is why recall on the melanoma class is treated as the primary optimisation target in most published work.
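For readers unfamiliar with the two metrics named above, here is a small sketch of how per-class recall and weighted F1 can be computed from true and predicted labels. The ten-image example is invented purely to make the arithmetic easy to follow.

```python
from collections import Counter


def per_class_scores(y_true, y_pred):
    """Return {label: (recall, f1, support)} for every label present in y_true."""
    support = Counter(y_true)
    predicted = Counter(y_pred)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    scores = {}
    for label, n in support.items():
        recall = hits[label] / n
        precision = hits[label] / predicted[label] if predicted[label] else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (recall, f1, n)
    return scores


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with each class weighted by its number of true samples."""
    scores = per_class_scores(y_true, y_pred)
    return sum(f1 * n for _, f1, n in scores.values()) / len(y_true)


# Invented example: 8 nevi and 2 melanomas; the model misses one melanoma.
y_true = ["nv"] * 8 + ["mel"] * 2
y_pred = ["nv"] * 8 + ["nv", "mel"]

scores = per_class_scores(y_true, y_pred)
print(f"melanoma recall: {scores['mel'][0]:.2f}")
print(f"weighted F1: {weighted_f1(y_true, y_pred):.3f}")
```

Weighted F1 stays high in this example even though half of the melanomas are missed, which illustrates why melanoma recall is tracked separately.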

What training setups are typical in published research?

Published skin lesion classification papers typically use PyTorch or TensorFlow with pretrained ImageNet weights, AdamW or SGD optimisers, cosine learning rate schedules, and batch sizes between 16 and 64. Input images are commonly resized to between 224x224 and 300x300 pixels and normalised to ImageNet mean and standard deviation.
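Two of the ingredients above are simple enough to write out directly. The sketch below shows a cosine learning rate schedule and per-channel ImageNet normalisation in plain Python. The mean and standard deviation are the standard ImageNet values; the base learning rate and step count are assumed example values, and in practice a framework scheduler and transform would be used instead.

```python
import math

# Standard ImageNet channel statistics for RGB inputs scaled to 0..1.
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def cosine_lr(step, total_steps, base_lr, min_lr=0.0):
    """Cosine annealing from base_lr at step 0 down to min_lr at total_steps."""
    progress = min(step, total_steps) / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))


def normalise_pixel(rgb):
    """Scale an 8-bit RGB pixel to 0..1, then standardise each channel."""
    return tuple((value / 255 - mean) / std
                 for value, mean, std in zip(rgb, IMAGENET_MEAN, IMAGENET_STD))


# Assumed example settings: base learning rate 1e-3 over 100 steps.
for step in (0, 25, 50, 75, 100):
    print(step, f"{cosine_lr(step, 100, 1e-3):.6f}")

print(normalise_pixel((255, 255, 255)))
```

The schedule starts at the base rate, passes through half of it at the midpoint, and decays smoothly to the minimum by the final step.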

A standard data split is 80% train, 10% validation, and 10% test, stratified by class. Early stopping on validation weighted-F1 is common, and ensemble averaging across multiple training seeds is a well-established technique for reducing variance on small medical imaging datasets, consistent with findings in the dermatologist-level classification paper by Esteva et al. FP16 mixed-precision training is now standard practice for reducing GPU memory consumption without degrading accuracy.
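A stratified 80/10/10 split can be sketched in a few lines. The function below is a simplified illustration with a hypothetical name and a toy label list; libraries such as scikit-learn offer more complete implementations.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, fractions=(0.8, 0.1, 0.1), seed=0):
    """Return (train, val, test) index lists that preserve per-class proportions."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# Toy label list with the same kind of skew seen in dermoscopy datasets.
labels = ["nv"] * 70 + ["mel"] * 20 + ["bcc"] * 10
train_idx, val_idx, test_idx = stratified_split(labels)

print(len(train_idx), len(val_idx), len(test_idx))
print(Counter(labels[i] for i in test_idx))
```

Splitting each class separately ensures that even the rarest class appears in the validation and test sets, which a purely random split does not guarantee.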

What do a CNN's feature maps reveal about skin lesion patterns?

In early convolutional layers, a CNN responds to low-level edge contrasts and colour boundaries. Deeper layers activate on lesion-specific patterns: irregular pigment networks, atypical vascular structures, and asymmetric colour distributions that correlate with malignancy in clinical dermoscopy criteria.

Grad-CAM (Gradient-weighted Class Activation Mapping) is a widely used interpretability technique that generates heatmaps showing which image regions most influenced each classification decision. For melanoma predictions, activation typically concentrates on areas of irregular pigment network and blue-white veil — consistent with the dermoscopic criteria described in American Academy of Dermatology (AAD) clinical guidelines at aad.org. Surfacing these heatmaps alongside confidence scores in clinical review interfaces allows dermatologists to verify which features the model weighted most heavily before confirming or overriding the classification.
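The core Grad-CAM computation is short once the feature maps and their gradients are available. The sketch below assumes both have already been extracted from a convolutional layer (in practice via framework hooks) and uses invented 2x2 values; it shows only the weighting, summation, and ReLU steps, not a full pipeline.

```python
def grad_cam(activations, gradients):
    """Grad-CAM heatmap from one layer's feature maps.

    activations, gradients: nested lists indexed [channel][row][col], where
    gradients holds d(class score)/d(activation). Returns a [row][col]
    heatmap scaled so that its largest value is 1.
    """
    rows, cols = len(activations[0]), len(activations[0][0])
    # Channel weight = spatial average of that channel's gradient.
    weights = [sum(map(sum, grad)) / (rows * cols) for grad in gradients]
    # Weighted sum over channels, then ReLU to keep only positive evidence.
    heatmap = [
        [max(0.0, sum(w * act[r][c] for w, act in zip(weights, activations)))
         for c in range(cols)]
        for r in range(rows)
    ]
    peak = max(max(row) for row in heatmap)
    if peak > 0:
        heatmap = [[value / peak for value in row] for row in heatmap]
    return heatmap


# Invented two-channel example.
activations = [
    [[1.0, 0.0], [0.0, 2.0]],
    [[0.0, 3.0], [1.0, 0.0]],
]
gradients = [
    [[0.5, 0.5], [0.5, 0.5]],      # channel 0 supports the class
    [[-1.0, -1.0], [-1.0, -1.0]],  # channel 1 argues against it
]

heatmap = grad_cam(activations, gradients)
print(heatmap)  # [[0.5, 0.0], [0.0, 1.0]]
```

The resulting low-resolution map is normally upsampled to the input size and overlaid on the dermoscopy image.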

The foundational work showing CNN accuracy at dermatologist level was published by Esteva et al. in Nature in 2017 at nature.com, and remains a key benchmark for dermatology ML (Machine Learning) research.

How are skin lesion classifiers validated against clinical benchmarks?

The ISIC challenge test sets provide consistent, blinded evaluation, which guards against overfitting to a single institution's data distribution. For ISIC 2019 (25,331 training images, 8 classes), the public leaderboard ranks teams by normalised multi-class accuracy, with top academic groups reporting scores in the 0.60–0.65 range. Researchers use this leaderboard to compare approaches.
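Normalised (balanced) multi-class accuracy is generally computed as the mean of per-class recall, so every class counts equally regardless of how many images it has. The sketch below computes it from a confusion matrix; the three-class matrix is invented for illustration and is not ISIC data.

```python
def balanced_accuracy(confusion):
    """Mean per-class recall; confusion[i][j] counts class-i images predicted as j."""
    recalls = [row[i] / sum(row) for i, row in enumerate(confusion) if sum(row)]
    return sum(recalls) / len(recalls)


def plain_accuracy(confusion):
    """Fraction of all images on the diagonal of the confusion matrix."""
    correct = sum(row[i] for i, row in enumerate(confusion))
    return correct / sum(map(sum, confusion))


# Invented matrix: one large class and two small ones.
confusion = [
    [90, 5, 5],
    [4, 5, 1],
    [2, 0, 8],
]

print(f"plain accuracy: {plain_accuracy(confusion):.3f}")
print(f"balanced accuracy: {balanced_accuracy(confusion):.3f}")
```

Because each class contributes equally, scores on this metric sit well below the plain accuracy figures often quoted for imbalanced dermoscopy datasets.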

Dermatologist-to-dermatologist agreement on dermoscopy classification studies typically falls between 70% and 85% depending on lesion type and image quality, consistent with findings reported in the HAM10000 dataset paper. This inter-rater variability provides important context for evaluating AI model agreement rates.

AI-assisted skin lesion tools are not diagnostic devices and are not designed to replace clinician judgment. The clinical value lies in prioritisation and pre-classification — helping certified dermatologists allocate review time to the cases most likely to require urgent attention. Health Canada's regulatory position on AI-assisted medical devices is evolving; any deployed tool must operate within applicable Canadian federal guidelines at canada.ca.

Sources

Tschandl, P., Rosendahl, C. & Kittler, H. (2018). The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions. Scientific Data 5, 180161. https://www.nature.com/articles/sdata2018161
Esteva, A. et al. (2017). Dermatologist-level classification of skin cancer with deep neural networks. Nature 542, 115-118. https://www.nature.com/articles/nature21056 (PubMed record: https://pubmed.ncbi.nlm.nih.gov/28117445/)
American Academy of Dermatology (AAD). Dermoscopy: Overview and clinical use. https://www.aad.org
Health Canada. Digital health and artificial intelligence in health care. https://www.canada.ca/en/health-canada.html
