Image Classification Guide

This guide walks you through creating and training an image classification model using AnyLearning. Image classification is a fundamental computer vision task in which each image is assigned to one of a set of predefined classes. For example, you might classify medical X-ray images as normal or as showing different types of pneumonia, or distinguish between species of flowers.

🚀 Step 1: Create a Project

First, create a new project specifically for image classification:

  1. Click the "New Project" button
  2. Select "Image Classification" as the project type
  3. Give your project a meaningful name and description

📊 Step 2: Data Preparation

🏷️ 2.1. Create the Label Set

The label set defines all possible classes that your model will learn to distinguish between. For example, in a medical X-ray classification project, your classes might be "NORMAL", "PNEUMONIA_BACTERIA", and "PNEUMONIA_VIRUS".

To create your label set:

  1. Navigate to the "Overview" tab
  2. Enter each class name individually in the input field
  3. Click "+" after each class name
  4. Ensure class names are descriptive and consistent
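
Before typing the class names into the input field, it can help to check the list for near-duplicates. The following is a minimal sketch, not part of AnyLearning: `check_label_set` is a hypothetical helper, and the rule it applies (names must be unique after trimming whitespace and ignoring case) is an assumption about what "consistent" means here.

```python
def check_label_set(class_names):
    """Return a list of human-readable problems found in the class names."""
    problems = []
    seen = {}  # normalized name -> first spelling encountered
    for name in class_names:
        stripped = name.strip()
        if not stripped or name != stripped:
            problems.append(f"{name!r}: empty or has surrounding whitespace")
        key = stripped.lower()
        if key in seen:
            problems.append(f"{name!r}: duplicates {seen[key]!r} (ignoring case)")
        else:
            seen[key] = name
    return problems


# Class names taken from the X-ray example above.
labels = ["NORMAL", "PNEUMONIA_BACTERIA", "PNEUMONIA_VIRUS"]
print(check_label_set(labels))
```

An empty list means no problems were found under these assumed rules.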

📁 2.2. Upload the Datasets

Go to the "Dataset" tab to manage your datasets.

For effective model training, you need to split your data into three sets:

  • Training set: The largest portion (typically 70-80%) used to train the model
  • Validation set: A smaller portion (typically 10-15%) used to tune hyperparameters and prevent overfitting
  • Test set: The remaining portion (typically 10-15%) used to evaluate the final model performance
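
The split described above can be sketched in a few lines of Python. This is an illustrative sketch only: `split_dataset`, the 80/10/10 fractions, and the file names are assumptions chosen for the example, not something AnyLearning requires.

```python
import random


def split_dataset(items, train_frac=0.8, val_frac=0.1, seed=42):
    """Shuffle items and split them into train, validation, and test lists."""
    if train_frac <= 0 or val_frac < 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave room for a test set")
    shuffled = list(items)
    # A fixed seed makes the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains
    return train, val, test


# Hypothetical file names, for illustration only.
files = [f"img_{i:03d}.jpeg" for i in range(100)]
train, val, test = split_dataset(files)
print(len(train), len(val), len(test))
```

Running the split once per class, rather than once over the whole pool, keeps the class proportions similar across the three sets.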

Upload Process:

  1. Navigate to the "Dataset" tab
  2. Compress each main folder (training, validation, test) into separate zip files
  3. Use the respective upload buttons for each set
  4. Wait for the upload and verification process to complete
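
The compression step can also be scripted. The sketch below uses Python's standard `shutil.make_archive`; the helper name `zip_split_folders`, the folder names, and the choice to place each folder's contents at the root of its archive are assumptions, so check them against the layout of the trial dataset before uploading.

```python
import shutil
from pathlib import Path


def zip_split_folders(data_root, splits=("training", "validation", "test")):
    """Create <split>.zip next to each split folder and return the archive paths."""
    data_root = Path(data_root)
    archives = []
    for split in splits:
        folder = data_root / split
        if not folder.is_dir():
            raise FileNotFoundError(f"missing folder: {folder}")
        # make_archive appends ".zip" to the base name itself.
        archive = shutil.make_archive(str(data_root / split), "zip", root_dir=folder)
        archives.append(Path(archive))
    return archives
```

For example, `zip_split_folders("my_dataset")` would write `my_dataset/training.zip`, `my_dataset/validation.zip`, and `my_dataset/test.zip`, where `my_dataset` is a placeholder path.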

Trial Dataset: We prepared a trial dataset to help you get started. You can download it from here.

💡 Important: Use different images for training, validation, and testing to ensure accurate model evaluation.
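
One way to check that the three sets do not share images is to compare file contents by hash. This is a minimal sketch under stated assumptions: `find_shared_images` and `file_digest` are hypothetical helpers, and each split is assumed to live in its own local folder. It only catches byte-identical files, not resized or re-encoded copies.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def find_shared_images(split_dirs):
    """Map each content hash found in more than one split to the files that carry it."""
    seen = {}  # digest -> list of (split name, file path)
    for split_name, folder in split_dirs.items():
        for path in sorted(Path(folder).rglob("*")):
            if path.is_file():
                seen.setdefault(file_digest(path), []).append((split_name, path))
    return {
        digest: entries
        for digest, entries in seen.items()
        if len({split for split, _ in entries}) > 1
    }
```

An empty result from `find_shared_images({"training": "data/training", "validation": "data/validation", "test": "data/test"})` (placeholder paths) means no identical file appears in two different sets.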

🔧 Step 3: Model Training

Training Configuration:

  1. Go to the "Training" tab
  2. Click "New Training Session"
  3. Configure the following hyperparameters:
    • Batch size: Number of images processed together in one training step (typically 32 or 64)
    • Learning rate: Controls how much the model adjusts its weights at each update (typically 0.001)
    • Epochs: Number of complete passes the model makes over the training set
    • Model Variant: The model architecture to train; variants trade off speed against accuracy
    • Pretrained: Choose default or fine-tune from a pre-trained model
  4. Click "Start Training" to begin the process
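
To get a feel for how batch size and epochs interact, the arithmetic below estimates the number of weight updates in a session. It is a sketch with assumed numbers (5,000 training images, batch size 32, 20 epochs), and it assumes the last, smaller batch of each epoch is kept rather than dropped, which may differ from what AnyLearning does internally.

```python
import math


def training_steps(num_images, batch_size, epochs):
    """Return (steps per epoch, total steps), counting a final partial batch."""
    steps_per_epoch = math.ceil(num_images / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs


steps_per_epoch, total_steps = training_steps(num_images=5000, batch_size=32, epochs=20)
print(steps_per_epoch, total_steps)  # 157 3140
```

Doubling the batch size roughly halves the number of updates per epoch, which is one reason batch size and learning rate are usually tuned together.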

Monitor Training Progress:

  • View all training sessions in the "Training" tab
  • Click on any session to see detailed information

Training Metrics and Logs:

  • 📈 Monitor loss values to ensure the model is learning
  • ✅ Check accuracy metrics on validation data
  • 📝 View training logs for detailed progress information
  • ⚠️ Watch for signs of overfitting (validation metrics getting worse while training metrics keep improving)
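
The overfitting check can be expressed as a simple rule over the per-epoch losses. The sketch below is one possible heuristic, not AnyLearning's own logic: the helper names, the `patience` of 3 epochs, and the loss values are all made up for illustration.

```python
def epochs_since_best(val_losses):
    """Return how many epochs have passed since the lowest validation loss."""
    if not val_losses:
        raise ValueError("val_losses must not be empty")
    best_epoch = min(range(len(val_losses)), key=val_losses.__getitem__)
    return len(val_losses) - 1 - best_epoch


def looks_overfit(train_losses, val_losses, patience=3):
    """Flag a run whose validation loss stalled while training loss kept falling."""
    if len(train_losses) != len(val_losses):
        raise ValueError("need one training and one validation loss per epoch")
    if len(val_losses) <= patience:
        return False  # too few epochs to judge
    stalled = epochs_since_best(val_losses) >= patience
    still_fitting = train_losses[-1] < train_losses[-1 - patience]
    return stalled and still_fitting


# Made-up loss curves: validation loss bottoms out at the fourth epoch.
train_losses = [0.90, 0.60, 0.45, 0.35, 0.28, 0.22, 0.18]
val_losses = [0.95, 0.70, 0.62, 0.60, 0.63, 0.66, 0.71]
print(epochs_since_best(val_losses), looks_overfit(train_losses, val_losses))
```

When the rule fires, common responses are to train for fewer epochs, add more training data, or start from a pre-trained model.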

🧪 Step 4: Test the Trained Model

After training completes, validate your model's performance:

  1. Go to the "Model" tab
  2. Click the "Try" button
  3. Upload test images that weren't used in training
  4. Analyze the model's predictions

The model displays its prediction along with confidence scores for each class.
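
If you record the predicted class for several test images, a per-class summary makes weak classes easier to spot than a single accuracy number. The following is a minimal sketch: `per_class_recall` is a hypothetical helper, and the label lists are invented for illustration rather than taken from a real run.

```python
from collections import Counter


def per_class_recall(true_labels, predicted_labels):
    """Return, for each true class, the fraction of its images predicted correctly."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("need exactly one prediction per true label")
    totals = Counter(true_labels)
    correct = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {label: correct[label] / totals[label] for label in totals}


# Invented results for six test images.
true_labels = ["NORMAL", "NORMAL", "PNEUMONIA_BACTERIA",
               "PNEUMONIA_BACTERIA", "PNEUMONIA_VIRUS", "PNEUMONIA_VIRUS"]
predicted_labels = ["NORMAL", "NORMAL", "PNEUMONIA_BACTERIA",
                    "PNEUMONIA_VIRUS", "PNEUMONIA_VIRUS", "PNEUMONIA_VIRUS"]
recall = per_class_recall(true_labels, predicted_labels)
print(recall)
```

In this invented example, half of the bacterial pneumonia images are misread as viral, which would suggest collecting more examples of that class.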

📦 Step 5: Export the Model and Use It in Your Code

Click the download button to download the trained model. You can choose either the raw PyTorch model or its ONNX conversion. Inference code for both formats is shown below.

5.1. Raw (PyTorch) model usage

  • Install the necessary libraries:
pip install numpy torch torchvision Pillow
  • Run the code:
import torch
from torchvision import transforms
from PIL import Image
 
MODEL_PATH = "best_model.pth" # Raw (PyTorch) model file
IMG_SIZE = 416
CLASS_NAMES = ["NORMAL", "PNEUMONIA_BACTERIA", "PNEUMONIA_VIRUS"] # The class names that have been defined in the Overview tab
IMAGE_PATH = "test_image.jpeg"
 
 
def get_transformations(img_size):
    transform = transforms.Compose(
        [
            transforms.Resize((img_size, img_size)),
            transforms.ToTensor(),
            transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
        ]
    )
    return transform
 
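# Load the full pickled model onto the CPU. Recent PyTorch releases default to
# weights_only=True, which rejects whole-model files, so it is disabled here.
# Only load model files from a source you trust.
model = torch.load(MODEL_PATH, map_location="cpu", weights_only=False)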
model.eval()
transform = get_transformations(IMG_SIZE)
pil_image = Image.open(IMAGE_PATH).convert("RGB")
 
inp = transform(pil_image)
inp = inp.unsqueeze(0)
with torch.no_grad():
    output = torch.softmax(model(inp)[0], dim=-1)
 
# Make a dictionary (class name -> prediction)
predictions = {}
for i, class_name in enumerate(CLASS_NAMES):
    predictions[class_name] = output[i].item()
 
# Sort the predictions by confidence
predictions = dict(sorted(predictions.items(), key=lambda item: item[1], reverse=True))
 
# display top 5 predictions (if number of classes is less than 5, display all)
top_k = min(5, len(predictions))
top_k_predictions = {k: predictions[k] for k in list(predictions)[:top_k]}
result =  {f'top_{top_k}_class_probability': top_k_predictions}
 
print(result)

5.2. Exported (ONNX) model usage

  • Install the necessary libraries:
pip install numpy torch torchvision onnxruntime Pillow
  • Run the code:
import onnxruntime as ort
import numpy as np
from PIL import Image
import torchvision.transforms as transforms
 
MODEL_PATH = "exported_model.onnx" # Exported ONNX model file
IMG_SIZE = 416
CLASS_NAMES = ["NORMAL", "PNEUMONIA_BACTERIA", "PNEUMONIA_VIRUS"] # The class names that have been defined in the Overview tab
IMAGE_PATH = "test_image.jpeg"
 
def softmax(x):
    max_x = np.max(x, axis=0)
    return np.exp(x - max_x) / np.sum(np.exp(x - max_x), axis=0)
 
def get_transformations(img_size):
    transform = transforms.Compose(
        [
            transforms.Resize((img_size, img_size)),
            transforms.ToTensor(),
            transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
        ]
    )
    return transform
 
# Load the ONNX model
ort_session = ort.InferenceSession(MODEL_PATH)
 
# Prepare the input
transform = get_transformations(IMG_SIZE)
pil_image = Image.open(IMAGE_PATH).convert("RGB")
inp = transform(pil_image)
inp = inp.unsqueeze(0).numpy()
 
# Run inference
outputs = ort_session.run(None, {ort_session.get_inputs()[0].name: inp})
output = softmax(outputs[0][0])
 
# Make a dictionary (class name -> prediction)
predictions = {}
for i, class_name in enumerate(CLASS_NAMES):
    predictions[class_name] = output[i].item()
 
# Sort the predictions by confidence
predictions = dict(sorted(predictions.items(), key=lambda item: item[1], reverse=True))
 
# display top 5 predictions (if number of classes is less than 5, display all)
top_k = min(5, len(predictions))
top_k_predictions = {k: predictions[k] for k in list(predictions)[:top_k]}
result =  {f'top_{top_k}_class_probability': top_k_predictions}
 
print(result)