How to Create Your Custom PyTorch Model¶

The Fed-BioMed framework allows you to perform model training without changing your PyTorch model class completely. It only requires extra attributes and methods to train your model based on a federated approach. In this tutorial, you will learn how to write/define your model in Fed-BioMed using the PyTorch framework.

Note: Before starting this tutorial, we highly recommend that you follow the previous tutorial to understand the basics of Fed-BioMed.

In this tutorial, you will learn:

What Fed-BioMed's training plan is
How to initialize your custom model
How to create your forward method
What the training_data method is and how to build a custom PyTorch DataLoader to use in training_data
How to prepare your model file to make it executable by the nodes

During this tutorial, we will be working on the CelebA (CelebFaces) dataset. You can see details of the dataset here. In the following sections, you will find instructions for downloading and configuring the CelebA dataset for the Fed-BioMed framework.

1. Fed-BioMed Training Plan¶

In this section, you are going to learn how to write your custom training plan.

What is Training Plan?¶

The training plan is the class that each node constructs during every round of training. In short, it defines the attributes and methods your network needs in order to be trained. For PyTorch, a training plan inherits from the class TorchTrainingPlan, which extends PyTorch's nn.Module and has been designed with the PyTorch model class in mind. For more details, you can visit the documentation for the training plan. The following code snippet shows the skeleton of a basic Fed-BioMed training plan for PyTorch.

class Net(TorchTrainingPlan):
    def __init__(self, kwargs):
        # ....
        pass

    def forward(self, x):
        # ...
        return

    def training_data(self,  batch_size = 48):
        # ...
        return

    def training_step(self, data, target):
        # ...
        return

`__init__` Method of Training Plan¶

The __init__ method of the training plan is where you initialize your neural network layers, just as in PyTorch. This is also where you can use model arguments to define the layers of the network. In addition, you can declare the extra dependencies that your model class needs by using the add_dependency method, which comes from TorchTrainingPlan.

As mentioned before, we will be working on a classification model on the CelebA image dataset. The model will be able to predict if the person smiles or not. Therefore, you need to define the network's layers for this classification problem.

def __init__(self, kwargs):

    super(Net, self).__init__()
    # Convolutional layers
    self.conv1 = nn.Conv2d(3, 32, 3, 1)
    self.conv2 = nn.Conv2d(32, 32, 3, 1)
    self.conv3 = nn.Conv2d(32, 32, 3, 1)
    self.conv4 = nn.Conv2d(32, 32, 3, 1)
    self.dropout1 = nn.Dropout(0.25)
    self.dropout2 = nn.Dropout(0.5)
    # Classifier
    self.fc1 = nn.Linear(3168, 128)
    self.fc2 = nn.Linear(128, 2)

    # Here we define the custom dependencies that will be needed by our custom Dataloader
    deps = ["from torch.utils.data import Dataset, DataLoader",
            "from torchvision import transforms",
            "import pandas as pd",
            "from PIL import Image",
            "import os",
            "import numpy as np"]

    self.add_dependency(deps)

`forward()` Method¶

Next, you should define the forward method using the layers that are defined in __init__. In the forward method, we define the forward pass from the input layer to the output layer of the network.

def forward(self, x):

        x = self.conv1(x)
        x = F.max_pool2d(x, 2)
        x = F.relu(x)

        x = self.conv2(x)
        x = F.max_pool2d(x, 2)
        x = F.relu(x)

        x = self.conv3(x)
        x = F.max_pool2d(x, 2)
        x = F.relu(x)

        x = self.conv4(x)
        x = F.max_pool2d(x, 2)
        x = F.relu(x)

        x = self.dropout1(x)
        x = torch.flatten(x, 1)
        x = self.fc1(x)
        x = F.relu(x)

        x = self.dropout2(x)
        x = self.fc2(x)
        output = F.log_softmax(x, dim=1)
        return output
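
The input size of fc1 (3168) follows from the shape of the feature map after the four convolution and pooling stages. The sketch below reproduces that arithmetic in plain Python. It assumes that the aligned CelebA images are 218 pixels high and 178 pixels wide, which is an assumption about the dataset rather than something stated above, and the helper name conv_pool_size is hypothetical.

```python
def conv_pool_size(size, kernel=3, stride=1, pool=2):
    """Spatial size after an unpadded convolution followed by max pooling."""
    after_conv = (size - kernel) // stride + 1   # nn.Conv2d(..., 3, 1) without padding
    return after_conv // pool                    # F.max_pool2d(x, 2) floors the result


# Assumed input resolution of an aligned CelebA image (height x width).
height, width = 218, 178
for _ in range(4):                               # conv1..conv4, each followed by pooling
    height, width = conv_pool_size(height), conv_pool_size(width)

channels = 32                                    # output channels of conv4
flattened = channels * height * width
print(height, width, flattened)                  # prints: 11 9 3168
```

If images with a different resolution were used, the first argument of fc1 would have to change accordingly.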

`training_data()` and Custom Dataset¶

training_data is an important part of the model class. Since the training plan is executed on different nodes, training_data should process and return the data stored on the node. During each round of training, every node builds your model, gets the data using the training_data method, and performs the training_step.

The dataset that we propose to use for training is a custom image dataset. Therefore, you need to define a custom Dataset for PyTorch. To do so, you have to create a new class in the training plan that subclasses PyTorch's Dataset.

Thanks to the Dataset class, we don't load all the images at once; each image is retrieved on demand by __getitem__. This uses far less RAM than loading every image of the dataset up front.
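
The on-demand loading idea can be illustrated without PyTorch. The following is a minimal sketch of a map-style dataset that only stores file names and opens a file when an item is requested. The class name LazyFileDataset, the reads counter, and the small binary files standing in for images are all hypothetical and exist only for this illustration.

```python
import os
import tempfile


class LazyFileDataset:
    """Map-style dataset that reads one file per __getitem__ call."""

    def __init__(self, directory, names, labels):
        self.directory = directory
        self.names = names          # file names only; contents stay on disk
        self.labels = labels
        self.reads = 0              # counts how many files were actually opened

    def __getitem__(self, index):
        path = os.path.join(self.directory, self.names[index])
        with open(path, "rb") as handle:
            data = handle.read()    # loaded only when this item is requested
        self.reads += 1
        return data, self.labels[index]

    def __len__(self):
        return len(self.labels)


# Hypothetical stand-in files instead of real images.
with tempfile.TemporaryDirectory() as tmp:
    names = [f"img_{i}.bin" for i in range(5)]
    for i, name in enumerate(names):
        with open(os.path.join(tmp, name), "wb") as handle:
            handle.write(bytes([i]) * 4)

    dataset = LazyFileDataset(tmp, names, labels=[0, 1, 0, 1, 1])
    sample, label = dataset[3]      # reads exactly one file
    print(len(dataset), dataset.reads, label)
```

A PyTorch Dataset follows the same two-method protocol, which is what lets a DataLoader fetch only the items needed for the current batch.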

class CelebaDataset(Dataset):
        """Custom Dataset for loading CelebA face images"""


        def __init__(self, txt_path, img_dir, transform=None):

            # Read the csv file that includes classes for each image
            df = pd.read_csv(txt_path, sep="\t", index_col=0)
            self.img_dir = img_dir
            self.txt_path = txt_path
            self.img_names = df.index.values
            self.y = df['Smiling'].values
            self.transform = transform

        def __getitem__(self, index):
            img = np.asarray(Image.open(os.path.join(self.img_dir, self.img_names[index])))
            img = transforms.ToTensor()(img)
            label = self.y[index]
            return img, label

        def __len__(self):
            return self.y.shape[0]

Now, you need to define a training_data method that will create a PyTorch DataLoader using the custom CelebaDataset class.

def training_data(self,  batch_size = 48):
        # The training_data creates the Dataloader to be used for training in the general class Torchnn of Fed-BioMed
        dataset = self.CelebaDataset(self.dataset_path + "/target.csv", self.dataset_path + "/data/")
        train_kwargs = {'batch_size': batch_size, 'shuffle': True}
        data_loader = DataLoader(dataset, **train_kwargs)
        return data_loader

`training_step()`¶

The last method that needs to be defined is training_step. This method is responsible for executing the forward method and calculating the loss value used in the backward pass of the network.

def training_step(self, data, target): 
    output = self.forward(data)
    loss   = torch.nn.functional.nll_loss(output, target)
    return loss
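
To make the loss computation concrete, here is a small dependency-free sketch of what log_softmax followed by nll_loss computes for a batch. It is written in plain Python for illustration only and is not the PyTorch implementation; the helper names mirror the PyTorch functions, and the mean over the batch corresponds to the default reduction of torch.nn.functional.nll_loss.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax of one row of raw scores."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return [v - log_sum for v in logits]


def nll_loss(log_probs, targets):
    """Mean negative log-likelihood; each target is a class index."""
    picked = [row[t] for row, t in zip(log_probs, targets)]
    return -sum(picked) / len(picked)


# Two samples, two classes (0 = not smiling, 1 = smiling).
raw_scores = [[2.0, 0.0], [0.0, 0.0]]
targets = [0, 1]
loss = nll_loss([log_softmax(row) for row in raw_scores], targets)
print(round(loss, 4))
```

Because each target is used as an index into the row of log-probabilities, the labels have to be valid class indices, here 0 or 1.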

You are now ready to create your training plan class. All you need to do is place every method explained in the previous sections inside the class. In the next steps, we will:

download the CelebA dataset and deploy it on the nodes
define our complete model and save it as a python file
create an experiment and run it
evaluate our model using a test dataset

2. Configuring Nodes¶

We will be working with the CelebA (CelebFaces) dataset. Therefore, please visit here and download the files img/img_align_celeba.zip and Anno/list_attr_celeba.txt. After the download is complete:

Go to ./notebooks/data/Celeba in the Fed-BioMed project.
Create the Celeba_raw/raw directory and copy the list_attr_celeba.txt file into it.
Extract the zip file img_align_celeba.zip into the same directory.

Your folder should look like the tree below:

Celeba
    README.md
    create_node_data.py    
    .gitignore

    Celeba_raw
        raw
            list_attr_celeba.txt
            img_align_celeba.zip
            img_align_celeba
              lots of images

Now, the dataset has to be processed and split to create three distinct datasets for Node 1, Node 2, and Node 3. You can do it easily by running the following script in your notebook. Please make sure that you start your notebook in the notebooks directory of Fed-BioMed. Otherwise, the paths defined in the following script may not work. If you are working in a different directory, please make sure that you define the correct paths in the following example.

Running the following script might take some time, so please be patient.

import os
import numpy as np
import pandas as pd
import shutil

# Celeba folder
parent_dir = os.path.join(".", "data", "Celeba") 
celeba_raw_folder = os.path.join("Celeba_raw", "raw")
img_dir = os.path.join(parent_dir, celeba_raw_folder, 'img_align_celeba') + os.sep
out_dir = os.path.join(".", "data", "Celeba", "celeba_preprocessed")

# Read the attribute file and only load the Smiling column
df = pd.read_csv(os.path.join(parent_dir, celeba_raw_folder, 'list_attr_celeba.txt'),
                 sep=r"\s+", skiprows=1, usecols=['Smiling'])

# Labels are 1 if the person is smiling and -1 otherwise. Map -1 to 0 so that
# the labels are valid class indices (0 or 1) for the loss used during training
df.loc[df['Smiling'] == -1, 'Smiling'] = 0

# Split the table into 3 parts
length = len(df)
data_node_1 = df.iloc[:int(length/3)]
data_node_2 = df.iloc[int(length/3):int(length/3) * 2]
data_node_3 = df.iloc[int(length/3) * 2:]

# Create a folder for each node (exist_ok makes the script safe to re-run)
for node_dir in ("data_node_1", "data_node_2", "data_node_3"):
    os.makedirs(os.path.join(out_dir, node_dir, "data"), exist_ok=True)

# Save each node's target CSV to the correct folder
data_node_1.to_csv(os.path.join(out_dir, 'data_node_1', 'target.csv'), sep='\t')
data_node_2.to_csv(os.path.join(out_dir, 'data_node_2', 'target.csv'), sep='\t')
data_node_3.to_csv(os.path.join(out_dir, 'data_node_3', 'target.csv'), sep='\t')

# Copy all images of each node into the correct folder
for im in data_node_1.index:
    shutil.copy(img_dir+im, os.path.join(out_dir,"data_node_1", "data", im))
print("data for node 1 successfully created")

for im in data_node_2.index:
    shutil.copy(img_dir+im, os.path.join(out_dir, "data_node_2", "data", im))
print("data for node 2 successfully created")

for im in data_node_3.index:
    shutil.copy(img_dir+im, os.path.join(out_dir, "data_node_3", "data", im))
print("data for node 3 successfully created")

Now if you go to the ${FEDBIOMED_DIR}/notebooks/data/Celeba directory, you can see the folder called celeba_preprocessed. It contains three folders, each holding the image dataset for one node. The next step is configuring the nodes and adding these datasets. We will configure only two nodes; the dataset for the third node is going to be used for testing.
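
The three-way split performed by the preprocessing script can be illustrated on a toy list. The sketch below applies the same index arithmetic that the script applies with df.iloc; the function name split_in_three and the ten file names are hypothetical stand-ins for the rows of the attribute table.

```python
def split_in_three(items):
    """Split a sequence into three contiguous parts using the script's index arithmetic."""
    third = int(len(items) / 3)
    return items[:third], items[third:third * 2], items[third * 2:]


# Hypothetical file names standing in for the rows of the attribute table.
rows = [f"{i:06d}.jpg" for i in range(1, 11)]
node_1, node_2, node_3 = split_in_three(rows)
print(len(node_1), len(node_2), len(node_3))   # prints: 3 3 4
```

When the number of rows is not divisible by three, the last part receives the remainder, so no row is dropped.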

Create 2 nodes for training:

${FEDBIOMED_DIR}/scripts/fedbiomed_run node config node1.ini start
${FEDBIOMED_DIR}/scripts/fedbiomed_run node config node2.ini start

Add data to each node:

${FEDBIOMED_DIR}/scripts/fedbiomed_run node config node1.ini add
${FEDBIOMED_DIR}/scripts/fedbiomed_run node config node2.ini add

Note: ${FEDBIOMED_DIR} is the path to the base directory of the cloned Fed-BioMed repository. You can set it by running the command export FEDBIOMED_DIR=/path/to/fedbiomed. This is not required for Fed-BioMed to work, but it lets you run the tutorials more easily.

2.1. Configuration Steps¶

You must first configure at least one node:

Run ${FEDBIOMED_DIR}/scripts/fedbiomed_run node config (ini file) add
- Select option 3 (images) to add an image dataset to the node
- Add a name and a tag for the dataset (the tag should contain '#celeba', as it is the tag used for this training), and finally add a description
- Pick a data folder from the 3 generated datasets inside data/Celeba/celeba_preprocessed (e.g. data_node_1)
- The data should now have been added (if you get a warning saying that data must be unique, the data has already been added)
Check that your data has been added by executing ${FEDBIOMED_DIR}/scripts/fedbiomed_run node config (ini file) list
Run the node using ${FEDBIOMED_DIR}/scripts/fedbiomed_run node config (ini file) start. Wait until you see the message Starting task manager, which means the node is online.

After these steps, you are ready to train your classification model over two different nodes.

3. Defining Custom PyTorch Model¶

You should set a file path where you want to save your model file. By default, fedbiomed.researcher.environ defines this location (TMP_DIR) as the 'tmp' directory inside the base Fed-BioMed directory.

from fedbiomed.researcher.environ import environ
import tempfile
import os
tmp_dir_model = tempfile.TemporaryDirectory(dir=environ['TMP_DIR']+os.sep)
model_file = os.path.join(tmp_dir_model.name, 'CelebaClass.py') # name of the model class

Now, it is time to create our Net class based on the methods explained in the previous section. Please do not forget to add the %%writefile "$model_file" command at the beginning of the following cell. This command writes the content of the cell into the file. The experiment can then access the model file and upload it to the file repository, making it accessible to the nodes. The nodes get the model file from the file repository and run the training based on the model defined in the Net class.
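
For readers working outside a notebook, the effect of %%writefile can be approximated in plain Python by writing the source text to the target path. The sketch below is only an illustration under that assumption: the shortened source string is a hypothetical placeholder for the real training plan, and the temporary directory is created with the standard library rather than taken from environ['TMP_DIR'].

```python
import os
import tempfile

# Hypothetical, shortened stand-in for the training plan source.
source = '''class Net:
    """Placeholder for the real training plan class."""
'''

tmp_dir = tempfile.TemporaryDirectory()
model_path = os.path.join(tmp_dir.name, "CelebaClass.py")

# Plain-Python equivalent of what %%writefile does with the cell body.
with open(model_path, "w", encoding="utf-8") as handle:
    handle.write(source)

with open(model_path, encoding="utf-8") as handle:
    written = handle.read()
print(os.path.basename(model_path), written == source)
```

The TemporaryDirectory object must stay alive for as long as the file is needed, since the directory is deleted when the object is cleaned up.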

%%writefile "$model_file"

import torch
import torch.nn as nn
from fedbiomed.common.torchnn import TorchTrainingPlan
import torch.nn.functional as F
from torchvision import transforms
from torch.utils.data import Dataset, DataLoader
import pandas as pd
import numpy as np
from PIL import Image
import os


class Net(TorchTrainingPlan):
    def __init__(self):
        super(Net, self).__init__()
        # Convolutional layers
        self.conv1 = nn.Conv2d(3, 32, 3, 1)
        self.conv2 = nn.Conv2d(32, 32, 3, 1)
        self.conv3 = nn.Conv2d(32, 32, 3, 1)
        self.conv4 = nn.Conv2d(32, 32, 3, 1)
        self.dropout1 = nn.Dropout(0.25)
        self.dropout2 = nn.Dropout(0.5)
        # Classifier
        self.fc1 = nn.Linear(3168, 128)
        self.fc2 = nn.Linear(128, 2)
        
        deps = ["from torch.utils.data import Dataset, DataLoader",
                "from torchvision import transforms",
                "import pandas as pd",
               "from PIL import Image",
               "import os",
               "import numpy as np"]
        self.add_dependency(deps)

    def forward(self, x):
        x = self.conv1(x)
        x = F.max_pool2d(x, 2)
        x = F.relu(x)
        
        x = self.conv2(x)
        x = F.max_pool2d(x, 2)
        x = F.relu(x)

        x = self.conv3(x)
        x = F.max_pool2d(x, 2)
        x = F.relu(x)

        x = self.conv4(x)
        x = F.max_pool2d(x, 2)
        x = F.relu(x)
        
        x = self.dropout1(x)
        x = torch.flatten(x, 1)
        x = self.fc1(x)
        x = F.relu(x)
        
        x = self.dropout2(x)
        x = self.fc2(x)
        output = F.log_softmax(x, dim=1)
        return output


    class CelebaDataset(Dataset):
        """Custom Dataset for loading CelebA face images"""
        
        def __init__(self, txt_path, img_dir, transform=None):
            df = pd.read_csv(txt_path, sep="\t", index_col=0)
            self.img_dir = img_dir
            self.txt_path = txt_path
            self.img_names = df.index.values
            self.y = df['Smiling'].values
            self.transform = transform
            print("celeba dataset finished")

        def __getitem__(self, index):
            img = np.asarray(Image.open(os.path.join(self.img_dir,
                                        self.img_names[index])))
            img = transforms.ToTensor()(img)
            label = self.y[index]
            return img, label

        def __len__(self):
            return self.y.shape[0]
    
    def training_data(self,  batch_size = 48):
        # The training_data creates the Dataloader to be used for training in the general class Torchnn of fedbiomed
        dataset = self.CelebaDataset(os.path.join(self.dataset_path, "target.csv"), 
                                     os.path.join(self.dataset_path, "data")+os.sep)
        train_kwargs = {'batch_size': batch_size, 'shuffle': True}
        data_loader = DataLoader(dataset, **train_kwargs)
        return data_loader
    
    def training_step(self, data, target):
        # This method must return the loss so that it can be backpropagated
        output = self.forward(data)
        loss   = torch.nn.functional.nll_loss(output, target)
        return loss

The arguments sent to the nodes fall into two groups:

model_args: a dictionary with the arguments related to the model (e.g. number of layers, features, etc.). This will be passed to the model class on the node-side.
training_args: a dictionary containing the arguments for the training routine (e.g. batch size, learning rate, epochs, etc.). This will be passed to the routine on the node-side.

Note: Typos in argument names or missing required (positional) arguments might raise an error.

training_args = {
    'batch_size': 32, 
    'lr': 1e-3, 
    'epochs': 1, 
    'dry_run': False,  
    'batch_maxnum': 100 # Fast pass for development : only use ( batch_maxnum * batch_size ) samples
}
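
As the comment on batch_maxnum indicates, this setting caps the number of samples used to batch_maxnum * batch_size. The short sketch below works out that cap for the values above. The node dataset size is an assumption made for illustration (one third of the 202,599 CelebA images), and the variable names are hypothetical.

```python
training_args = {"batch_size": 32, "batch_maxnum": 100}

# Assumed size of one node's dataset (a third of 202,599 images); illustrative only.
node_samples = 67_533

cap = training_args["batch_size"] * training_args["batch_maxnum"]
used_per_epoch = min(cap, node_samples)
print(used_per_epoch, round(100 * used_per_epoch / node_samples, 1))
```

Under these assumptions, each node trains on only a few percent of its data per epoch, which is why the setting is described as a fast pass for development.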

4. Training Federated Model¶

To orchestrate training over two nodes, we need to define an experiment. The experiment:

searches for nodes serving data that match the tags,
defines the local training on nodes with the model saved in model_path, and federates all local updates at each round with the aggregator,
runs training for the given number of rounds.

You can visit the user guide to learn more about experiments.
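
To give an intuition for what federating local updates means, the sketch below computes a weighted average of per-node parameters, which is the general idea behind federated averaging. It is not Fed-BioMed's FedAverage implementation: the function name federated_average, the use of plain lists instead of tensors, and the choice of weights are all assumptions made for illustration.

```python
def federated_average(node_params, node_weights):
    """Weighted average of per-node parameter dictionaries."""
    total = sum(node_weights)
    averaged = {}
    for name in node_params[0]:
        size = len(node_params[0][name])
        averaged[name] = [
            sum(w * params[name][i] for params, w in zip(node_params, node_weights)) / total
            for i in range(size)
        ]
    return averaged


# Hypothetical updates from two nodes; plain lists stand in for tensors.
node_1 = {"fc2.bias": [0.0, 2.0]}
node_2 = {"fc2.bias": [4.0, 6.0]}
aggregated = federated_average([node_1, node_2], node_weights=[1, 3])
print(aggregated)   # prints: {'fc2.bias': [3.0, 5.0]}
```

A common choice is to weight each node by the number of samples it trained on, so that nodes with more data contribute more to the aggregated model.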

from fedbiomed.researcher.experiment import Experiment
from fedbiomed.researcher.aggregators.fedavg import FedAverage

tags =  ['#celeba']
rounds = 3

exp = Experiment(tags=tags,
                 model_path=model_file,
                 model_class='Net',
                 training_args=training_args,
                 rounds=rounds,
                 aggregator=FedAverage(),
                 node_selection_strategy=None)

Let's start the experiment.

By default, this function doesn't stop until all the rounds are done for all the nodes. While the experiment runs, you can open the terminals where you started the nodes and see the training progress. In addition, the loss values obtained from each node during training are printed as output in real time. Since we are working on an image dataset, training might take some time.

exp.run()

Loading Training Parameters¶

After all the rounds have been completed, you can retrieve the aggregated parameters from the last round and load them into the model.

fed_model = exp.model_instance
fed_model.load_state_dict(exp.aggregated_params[rounds - 1]['params'])

5. Testing Federated Model¶

We will define a testing routine to extract the accuracy metrics on the testing dataset. We will use the dataset that has been extracted into data_node_3.

import torch
import torch.nn as nn

import torch.nn.functional as F
from torch.utils.data import DataLoader
from torchvision import transforms
from torch.utils.data import Dataset, DataLoader
import pandas as pd
import numpy as np
from PIL import Image
import os

def testing_Accuracy(model, data_loader):
    model.eval()
    test_loss = 0
    correct = 0
    n_samples = 0  # number of samples actually evaluated
    device = "cpu"

    loader_size = len(data_loader)
    with torch.no_grad():
        for idx, (data, target) in enumerate(data_loader):
            data, target = data.to(device), target.to(device)
            output = model(data)
            test_loss += F.nll_loss(output, target, reduction='sum').item()  # sum up batch loss
            pred = output.argmax(dim=1, keepdim=True)  # get the index of the max log-probability
            correct += pred.eq(target.view_as(pred)).sum().item()
            n_samples += target.size(0)

            # Only use about 10% of the dataset; results are similar but faster
            if idx >= loader_size / 10:
                break

    # Normalize by the number of samples evaluated, not by the full dataset size
    test_loss /= n_samples
    accuracy = 100 * correct / n_samples

    return test_loss, accuracy
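
Accuracy is the share of correct predictions among the samples that were actually evaluated, which matters when only part of the dataset is used or when the last batch is smaller than the others. The following dependency-free sketch shows that computation on hand-written log-probabilities; the function name accuracy_percent and the example values are hypothetical.

```python
def accuracy_percent(batches):
    """Accuracy over the batches actually evaluated, as a percentage."""
    correct = 0
    seen = 0
    for log_probs, targets in batches:
        for row, target in zip(log_probs, targets):
            predicted = max(range(len(row)), key=row.__getitem__)  # argmax over classes
            correct += int(predicted == target)
            seen += 1
    return 100 * correct / seen


# Two batches of different sizes: three samples, then one sample.
batches = [
    ([[-0.1, -2.3], [-1.6, -0.2], [-0.7, -0.69]], [0, 1, 0]),
    ([[-0.3, -1.4]], [0]),
]
print(accuracy_percent(batches))   # prints: 75.0
```

Counting the samples seen, rather than multiplying the batch size by the number of batches, keeps the result correct even when batches have different sizes.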

We also need to define a custom Dataset class for the test dataset in order to load it using PyTorch's DataLoader. This will be the same class that has been already defined in the training plan.

test_dataset_path = "./data/Celeba/celeba_preprocessed/data_node_3"

class CelebaDataset(Dataset):
    """Custom Dataset for loading CelebA face images"""

    def __init__(self, txt_path, img_dir, transform=None):
        df = pd.read_csv(txt_path, sep="\t", index_col=0)
        self.img_dir = img_dir
        self.txt_path = txt_path
        self.img_names = df.index.values
        self.y = df['Smiling'].values
        self.transform = transform
        print("celeba dataset finished")

    def __getitem__(self, index):
        img = np.asarray(Image.open(os.path.join(self.img_dir,
                                        self.img_names[index])))
        img = transforms.ToTensor()(img)
        label = self.y[index]
        return img, label

    def __len__(self):
        return self.y.shape[0]
    

dataset = CelebaDataset(os.path.join(test_dataset_path, "target.csv"),
                        os.path.join(test_dataset_path, "data") +os.sep)
train_kwargs = {'batch_size': 128, 'shuffle': True}
data_loader = DataLoader(dataset, **train_kwargs)

acc_federated = testing_Accuracy(fed_model, data_loader)
acc_federated[1]

Conclusions¶

In this tutorial, we explained how to run a custom model on Fed-BioMed using the PyTorch framework. Because the examples are designed for the development environment, we have been running the nodes on the same host machine. In production, the nodes used to train your model will run on remote servers. Since Fed-BioMed is still in the development phase, the functions and methods used in these tutorials might change in the future. Therefore, please keep up to date by following our GitLab repository.
