Training with Approved Model Files¶
Fed-BioMed offers a feature to run only pre-approved models on the nodes. The nodes that you will be sending your model to might require approved models. Therefore, the model files sent by a researcher should be approved by the node side in advance. In this workflow, the approval process is done by a real user/person who reviews the code contained in the model file. The reviewer makes sure the model doesn't contain any code that might cause privacy issues or harm the node.
In this tutorial, we will create a node with the model approval option activated and run the getting-started MNIST example.
Setting Up a Node¶
Enabling model approval can be done from the configuration file or from the Fed-BioMed CLI (Command Line Interface) while starting the node. The process of creating and starting a node with the model approval option is not very different from setting up a normal node. By default, if no option is specified for the CLI, the node disables the model approval option. The default security section of the configuration file looks like the configuration below (under the [security] sub-section).
[security]
hashing_algorithm = SHA256
allow_default_models = True
model_approval = False
The Fed-BioMed CLI accepts two additional parameters, --enable-model-approval and --allow-default-models, to activate model approval:
--enable-model-approval: Enables model approval for the node. If there is no config file for the node when running the CLI, it creates a new config file with model approval enabled (model_approval = True).
--allow-default-models: Allows default models for train requests. These are the models that come with the Fed-BioMed tutorials, for example the model for the MNIST dataset that we will be using in this tutorial. If default models are allowed, the node updates/registers the model files located in the envs/developments/default_models directory while starting up.
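For example, both options can be combined when starting a node (shown here with the config file that we will create in the next section):
$ {FEDBIOMED_DIR}/scripts/fedbiomed_run node config config-n1.ini --enable-model-approval --allow-default-models start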
You can visit the documentation for the model manager for more information about managing models on the nodes.
Adding the MNIST Dataset to the Node¶
In this section, we will add the MNIST dataset to a new node. While adding the dataset through the CLI, we'll also specify the --enable-model-approval and --allow-default-models options. Now, let's run the following command.
$ {FEDBIOMED_DIR}/scripts/fedbiomed_run node config config-n1.ini --enable-model-approval --allow-default-models add
The CLI will ask you to select the dataset type. Since we will be working on the MNIST dataset, please select 2 (default), continue by typing y at the next prompt, and select the folder where you want to store the MNIST dataset. Afterward, if you go to the etc directory of Fed-BioMed, you will see the config-n1.ini file.
The above shell command will create a new config-n1.ini file with the following configuration:
[security]
hashing_algorithm = SHA256
allow_default_models = True
model_approval = True
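You can verify the change from a terminal (assuming the default etc layout of Fed-BioMed mentioned above):
$ cat {FEDBIOMED_DIR}/etc/config-n1.ini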
Starting the Node¶
Now you can start your node by running the following command:
$ {FEDBIOMED_DIR}/scripts/fedbiomed_run node config config-n1.ini start
Since the config file has been configured to enable model approval mode, you do not need to specify any extra parameters while starting the node. However, it is also possible to start the node with --enable-model-approval and --allow-default-models, or with --disable-model-approval and --disable-default-models. If you start your node with --disable-model-approval, it will disable model approval even if it is enabled in the config file.
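For instance, the following command starts the node with model approval disabled, regardless of what the config file says:
$ {FEDBIOMED_DIR}/scripts/fedbiomed_run node config config-n1.ini --disable-model-approval start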
Creating an Experiment with an Approved Model File¶
In this section, we will be using the default MNIST model which has already been registered by the node. We'll create an experiment and check whether the model file is approved or not.
from fedbiomed.researcher.environ import environ
import tempfile
import os
tmp_dir_model = tempfile.TemporaryDirectory(dir=environ['TMP_DIR']+os.sep)
model_file = os.path.join(tmp_dir_model.name, 'class_export_mnist.py')
The following model is the one that will be sent to the node for training. Since model files are processed by the Experiment to configure dependencies, the part that imports modules might differ from this one. Therefore, it is important to get the final model after initializing the experiment.
%%writefile "$model_file"

import torch
import torch.nn as nn
import torch.nn.functional as F
from fedbiomed.common.torchnn import TorchTrainingPlan
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Here we define the model to be used.
# You can use any class name (here 'MyTrainingPlan')
class MyTrainingPlan(TorchTrainingPlan):
    def __init__(self):
        super(MyTrainingPlan, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, 3, 1)
        self.conv2 = nn.Conv2d(32, 64, 3, 1)
        self.dropout1 = nn.Dropout(0.25)
        self.dropout2 = nn.Dropout(0.5)
        self.fc1 = nn.Linear(9216, 128)
        self.fc2 = nn.Linear(128, 10)

        # Here we define the custom dependencies that will be needed by our custom Dataloader
        # In this case, we need the torch DataLoader classes
        # Since we will train on MNIST, we need datasets and transforms from torchvision
        deps = ["from torchvision import datasets, transforms",
                "from torch.utils.data import DataLoader"]
        self.add_dependency(deps)

    def forward(self, x):
        x = self.conv1(x)
        x = F.relu(x)
        x = self.conv2(x)
        x = F.relu(x)
        x = F.max_pool2d(x, 2)
        x = self.dropout1(x)
        x = torch.flatten(x, 1)
        x = self.fc1(x)
        x = F.relu(x)
        x = self.dropout2(x)
        x = self.fc2(x)
        output = F.log_softmax(x, dim=1)
        return output

    def training_data(self, batch_size=48):
        # Custom torch Dataloader for MNIST data
        transform = transforms.Compose([transforms.ToTensor(),
                                        transforms.Normalize((0.1307,), (0.3081,))])
        dataset1 = datasets.MNIST(self.dataset_path, train=True, download=False, transform=transform)
        train_kwargs = {'batch_size': batch_size, 'shuffle': True}
        data_loader = torch.utils.data.DataLoader(dataset1, **train_kwargs)
        return data_loader

    def training_step(self, data, target):
        output = self.forward(data)
        loss = torch.nn.functional.nll_loss(output, target)
        return loss
To be able to see the final model file, we first need to initialize the experiment.
from fedbiomed.researcher.experiment import Experiment
from fedbiomed.researcher.aggregators.fedavg import FedAverage
tags = ['#MNIST', '#dataset']
rounds = 2
model_args = {}
training_args = {
    'batch_size': 48,
    'lr': 1e-3,
    'epochs': 1,
    'dry_run': False,
    'batch_maxnum': 100  # Fast pass for development: only use (batch_maxnum * batch_size) samples
}
exp = Experiment(tags=tags,
                 #nodes=None,
                 model_path=model_file,
                 model_args=model_args,
                 model_class='MyTrainingPlan',
                 training_args=training_args,
                 rounds=rounds,
                 aggregator=FedAverage(),
                 node_selection_strategy=None)
Getting the Final Model File From the Experiment¶
The Experiment's model_file() method displays the model file that will be sent to the nodes. Even if the experiment couldn't find any node to train your model, you should still be able to get your final model.
exp.model_file()
# or, to get only the path where the model file is saved:
# exp.model_file(display=False)
Checking the Status of the Model¶
The exp.check_model_status() method sends a request to the nodes to check whether the model is approved or not. The request is sent only to the nodes that were found during the dataset search.
status = exp.check_model_status()
The logs should indicate that the model is approved. You can also get the status objects from the return value of check_model_status(): it returns a list of status objects, one per node. For this example, it will contain a single status object since we have launched only one node.
approval_obligation: Indicates whether the model approval option is enabled on the node.
is_approved: Indicates whether the model has been approved by the node or not.
status
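If you want to inspect the result programmatically instead of just displaying it, here is a minimal sketch (assuming each entry of the returned list exposes the two fields described above as dictionary keys):
# Iterate over the per-node status objects returned by check_model_status()
for node_status in status:
    print(node_status['approval_obligation'], node_status['is_approved'])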
Changing the Model and Testing the Model Approval Status¶
Let's change our previous default model and test whether it is approved or not. We will be changing the network structure.
from fedbiomed.researcher.environ import environ
import tempfile, os
tmp_dir_model = tempfile.TemporaryDirectory(dir=environ['TMP_DIR']+os.sep)
model_file_2 = os.path.join(tmp_dir_model.name, 'class_export_mnist_2.py')
%%writefile "$model_file_2"

import torch
import torch.nn as nn
import torch.nn.functional as F
from fedbiomed.common.torchnn import TorchTrainingPlan
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

class MyTrainingPlan(TorchTrainingPlan):
    def __init__(self):
        super(MyTrainingPlan, self).__init__()
        self.conv1 = nn.Conv2d(1, 16, 5, 1, 2)
        self.conv2 = nn.Conv2d(16, 32, 5, 1, 2)
        self.fc1 = nn.Linear(32 * 7 * 7, 10)

        deps = ["from torchvision import datasets, transforms",
                "from torch.utils.data import DataLoader"]
        self.add_dependency(deps)

    def forward(self, x):
        x = self.conv1(x)
        x = F.relu(x)
        x = F.max_pool2d(x, 2)
        x = self.conv2(x)
        x = F.relu(x)
        x = F.max_pool2d(x, 2)
        x = torch.flatten(x, 1)
        x = self.fc1(x)
        output = F.log_softmax(x, dim=1)
        return output

    def training_data(self, batch_size=48):
        # Custom torch Dataloader for MNIST data
        transform = transforms.Compose([transforms.ToTensor(),
                                        transforms.Normalize((0.1307,), (0.3081,))])
        dataset1 = datasets.MNIST(self.dataset_path, train=True, download=False, transform=transform)
        train_kwargs = {'batch_size': batch_size, 'shuffle': True}
        data_loader = torch.utils.data.DataLoader(dataset1, **train_kwargs)
        return data_loader

    def training_step(self, data, target):
        output = self.forward(data)
        loss = torch.nn.functional.nll_loss(output, target)
        return loss
from fedbiomed.researcher.experiment import Experiment
from fedbiomed.researcher.aggregators.fedavg import FedAverage
tags = ['#MNIST', '#dataset']
rounds = 2
model_args = {}
training_args = {
    'batch_size': 48,
    'lr': 1e-3,
    'epochs': 1,
    'dry_run': False,
    'batch_maxnum': 100  # Fast pass for development: only use (batch_maxnum * batch_size) samples
}
exp2 = Experiment(tags=tags,
                  model_path=model_file_2,
                  model_args=model_args,
                  model_class='MyTrainingPlan',
                  training_args=training_args,
                  rounds=rounds,
                  aggregator=FedAverage(),
                  node_selection_strategy=None)
Since we changed the model architecture (we removed the dropouts and one dense layer, fc2) in this experiment, the output of the following method should say that the model is not approved by the node, and the is_approved key of the result object should be equal to False.
status = exp2.check_model_status()
status
Since the model is not approved, you won't be able to train it on the node.
exp2.run()
In that case, you should contact the node owner and ask for model approval.
Registering/Approving the Model¶
To register/approve the model that was created in the previous section, we can use the Fed-BioMed CLI. You do not need to stop your node to register new models; you can perform the registration process in a different terminal window. However, first we need to create another experiment as exp3 and get the model file.
In the previous notebook cells, we tried to run a model which is not approved by the node. Therefore, your notebook kernel should have been killed, and you might need to restart it to be able to run your experiment. After restarting, please follow the tutorial directly from this section.
from fedbiomed.researcher.environ import environ
import tempfile, os
tmp_dir_model = tempfile.TemporaryDirectory(dir=environ['TMP_DIR']+os.sep)
model_file_3 = os.path.join(tmp_dir_model.name, 'class_export_mnist_3.py')
%%writefile "$model_file_3"

import torch
import torch.nn as nn
import torch.nn.functional as F
from fedbiomed.common.torchnn import TorchTrainingPlan
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

class MyTrainingPlan(TorchTrainingPlan):
    def __init__(self):
        super(MyTrainingPlan, self).__init__()
        self.conv1 = nn.Conv2d(1, 16, 5, 1, 2)
        self.conv2 = nn.Conv2d(16, 32, 5, 1, 2)
        self.fc1 = nn.Linear(32 * 7 * 7, 10)

        deps = ["from torchvision import datasets, transforms",
                "from torch.utils.data import DataLoader"]
        self.add_dependency(deps)

    def forward(self, x):
        x = self.conv1(x)
        x = F.relu(x)
        x = F.max_pool2d(x, 2)
        x = self.conv2(x)
        x = F.relu(x)
        x = F.max_pool2d(x, 2)
        x = torch.flatten(x, 1)
        x = self.fc1(x)
        output = F.log_softmax(x, dim=1)
        return output

    def training_data(self, batch_size=48):
        # Custom torch Dataloader for MNIST data
        transform = transforms.Compose([transforms.ToTensor(),
                                        transforms.Normalize((0.1307,), (0.3081,))])
        dataset1 = datasets.MNIST(self.dataset_path, train=True, download=False, transform=transform)
        train_kwargs = {'batch_size': batch_size, 'shuffle': True}
        data_loader = torch.utils.data.DataLoader(dataset1, **train_kwargs)
        return data_loader

    def training_step(self, data, target):
        output = self.forward(data)
        loss = torch.nn.functional.nll_loss(output, target)
        return loss
from fedbiomed.researcher.experiment import Experiment
from fedbiomed.researcher.aggregators.fedavg import FedAverage
tags = ['#MNIST', '#dataset']
rounds = 2
model_args = {}
training_args = {
    'batch_size': 48,
    'lr': 1e-3,
    'epochs': 1,
    'dry_run': False,
    'batch_maxnum': 100  # Fast pass for development: only use (batch_maxnum * batch_size) samples
}
exp3 = Experiment(tags=tags,
                  model_path=model_file_3,
                  model_args=model_args,
                  model_class='MyTrainingPlan',
                  training_args=training_args,
                  rounds=rounds,
                  aggregator=FedAverage(),
                  node_selection_strategy=None)
exp3.model_file()
The output of exp3.model_file() is a file path that shows where the final model is saved. It also prints the content of the model file. You can either get the content of the model from the output cell or from the path where it is saved. Either way, you need to create a new txt file and copy the model content into it. You can create a new directory in Fed-BioMed called my_approved_model and, inside it, create a new my_model.txt file and copy the model content into it.
$ mkdir {FEDBIOMED_DIR}/my_approved_model
$ cp <model_path_file> {FEDBIOMED_DIR}/my_approved_model/my_model.txt
Where <model_path_file> is the path of the model file output by exp3.model_file(display=False).
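If you prefer to stay in Python, you can also copy the file with the standard library (a minimal sketch; it assumes exp3.model_file(display=False) returns the model file path as described above, and that FEDBIOMED_DIR is set in your environment):
import os, shutil

model_path = exp3.model_file(display=False)  # path of the final model file
dest_dir = os.path.join(os.environ['FEDBIOMED_DIR'], 'my_approved_model')
os.makedirs(dest_dir, exist_ok=True)         # create the target folder if missing
shutil.copyfile(model_path, os.path.join(dest_dir, 'my_model.txt'))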
Afterward, please run the following command in another terminal to register the model file.
$ {FEDBIOMED_DIR}/scripts/fedbiomed_run node config config-n1.ini --register-model
You should type a unique name for your model, e.g. 'MyTestModel-1', and a description. The CLI will ask you to select the model file you want to register. Select the file that you saved and continue.
Now, you should be able to train your model.
exp3.check_model_status()
exp3.run()