Training with Secure Aggregation

Secure aggregation is one of the security features provided by Fed-BioMed. Please refer to the secure aggregation user guide for more information on the methods and techniques that are used. This tutorial gives an example of secure aggregation usage in Fed-BioMed.
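To build intuition about what secure aggregation guarantees, the short sketch below illustrates pairwise additive masking, a classic secure aggregation technique: each node hides its model update behind random masks that cancel out in the sum, so the aggregating party learns only the aggregate. This is plain illustrative Python, not Fed-BioMed's actual protocol (Fed-BioMed's scheme is described in the secure aggregation user guide), and every name in it is hypothetical.

import random

def pairwise_masks(node_ids, seed=1234):
    # One shared random mask per pair of nodes (hypothetical helper).
    rng = random.Random(seed)
    return {(a, b): rng.randint(0, 10**6)
            for i, a in enumerate(node_ids) for b in node_ids[i + 1:]}

node_ids = ["node-1", "node-2", "node-3"]
updates = {"node-1": 4.0, "node-2": 7.0, "node-3": 1.0}  # toy scalar "updates"
masks = pairwise_masks(node_ids)

# Each node adds the masks shared with "later" nodes and subtracts those
# shared with "earlier" nodes, so every mask appears once with each sign.
masked = {}
for n in node_ids:
    value = updates[n]
    for (a, b), m in masks.items():
        value += m if a == n else (-m if b == n else 0)
    masked[n] = value

# The aggregator sees only the masked values, yet their sum is the true sum.
assert abs(sum(masked.values()) - sum(updates.values())) < 1e-6
print(sum(masked.values()))  # 12.0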

Setting up the nodes

In this tutorial, the nodes and the researcher will be launched locally using a single clone of Fed-BioMed. However, it is also possible to execute the notebook cells with remotely configured components, provided the corresponding instructions are followed.

Start the network

Before running this notebook, start the network with ./scripts/fedbiomed_run network

Configuring/Installing Elements for Secure Aggregation

You can follow the detailed instructions for configuring a Fed-BioMed instance for secure aggregation, or apply the shortened instructions below for a basic setup.

1. Install and configure

Fed-BioMed uses MP-SPDZ for multi-party computation (MPC). Therefore, please make sure that MP-SPDZ is installed and configured for Fed-BioMed by running the following command.

${FEDBIOMED_DIR}/scripts/fedbiomed_configure_secagg node

Since the node and the researcher will run on the same machine, a single MP-SPDZ configuration will be enough.

2. Create node and researcher instances

The setup for secure aggregation requires knowing the participating Fed-BioMed components in advance. Therefore, each component that will participate in the training should be created before any of them is started. Afterwards, the participating components can be registered in every other component.

2.1 Create nodes

It is mandatory to have at least two nodes for an experiment that uses secure aggregation. Please execute the following commands to create two nodes.

Node 1:

${FEDBIOMED_DIR}/scripts/fedbiomed_run node config config-n1.ini configuration create

Node 2:

${FEDBIOMED_DIR}/scripts/fedbiomed_run node config config-n2.ini configuration create
2.2 Create researcher

Please run the command below to create the researcher component.

${FEDBIOMED_DIR}/scripts/fedbiomed_run researcher configuration create

3. Registering participating Fed-BioMed instances

Normally, as mentioned in the secure aggregation configuration guide, each participating instance should register the network credentials of the others, such as IP, port and SSL certificate. However, since this example runs on a single clone of Fed-BioMed, the registration process can be done automatically by running the following command.

${FEDBIOMED_DIR}/scripts/fedbiomed_run certificate-dev-setup

4. Add dataset and start nodes

The next step is adding/deploying the MNIST dataset on the nodes and starting them. For this step you can follow the instructions for adding a dataset into nodes to add the MNIST dataset. After the datasets are deployed you can start the nodes and the researcher, for example as sketched below.
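As a sketch only, following the CLI conventions used above (the exact subcommands can differ between Fed-BioMed versions, so please rely on the linked instructions), the sequence could look like this:

Node 1:

${FEDBIOMED_DIR}/scripts/fedbiomed_run node config config-n1.ini add
${FEDBIOMED_DIR}/scripts/fedbiomed_run node config config-n1.ini start

Node 2:

${FEDBIOMED_DIR}/scripts/fedbiomed_run node config config-n2.ini add
${FEDBIOMED_DIR}/scripts/fedbiomed_run node config config-n2.ini start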

Define an experiment model and parameters

Declare a torch training plan MyTrainingPlan class to send for training on the node.

In [ ]:
import torch
import torch.nn as nn
import torch.nn.functional as F
from fedbiomed.common.training_plans import TorchTrainingPlan
from fedbiomed.common.data import DataManager
from torchvision import datasets, transforms


# Here we define the model to be used. 
# You can use any class name (here 'Net')
class MyTrainingPlan(TorchTrainingPlan):
    
    # Defines and return model 
    def init_model(self, model_args):
        return self.Net(model_args = model_args)
    
    # Defines and return optimizer
    def init_optimizer(self, optimizer_args):
        return torch.optim.Adam(self.model().parameters(), lr = optimizer_args["lr"])
    
    # Declares and return dependencies
    def init_dependencies(self):
        deps = ["from torchvision import datasets, transforms"]
        return deps
    
    class Net(nn.Module):
        def __init__(self, model_args):
            super().__init__()
            self.conv1 = nn.Conv2d(1, 32, 3, 1)
            self.conv2 = nn.Conv2d(32, 64, 3, 1)
            self.dropout1 = nn.Dropout(0.25)
            self.dropout2 = nn.Dropout(0.5)
            self.fc1 = nn.Linear(9216, 128)
            self.fc2 = nn.Linear(128, 10)

        def forward(self, x):
            x = self.conv1(x)
            x = F.relu(x)
            x = self.conv2(x)
            x = F.relu(x)
            x = F.max_pool2d(x, 2)
            x = self.dropout1(x)
            x = torch.flatten(x, 1)
            x = self.fc1(x)
            x = F.relu(x)
            x = self.dropout2(x)
            x = self.fc2(x)


            output = F.log_softmax(x, dim=1)
            return output

    def training_data(self, batch_size = 48):
        # Custom torch Dataloader for MNIST data
        transform = transforms.Compose([transforms.ToTensor(),
        transforms.Normalize((0.1307,), (0.3081,))])
        dataset1 = datasets.MNIST(self.dataset_path, train=True, download=False, transform=transform)
        train_kwargs = {'batch_size': batch_size, 'shuffle': True}
        return DataManager(dataset=dataset1, **train_kwargs)
    
    def training_step(self, data, target):
        output = self.model().forward(data)
        loss   = torch.nn.functional.nll_loss(output, target)
        return loss

These groups of arguments correspond respectively to:

  • model_args: a dictionary with the arguments related to the model (e.g. number of layers, features, etc.). This will be passed to the model class on the node side.
  • training_args: a dictionary containing the arguments for the training routine (e.g. batch size, learning rate, epochs, etc.). This will be passed to the routine on the node side.

NOTE: typos and/or missing positional (required) arguments will raise an error. 🤓

In [ ]:
model_args = {}

training_args = {
    'batch_size': 48, 
    'optimizer_args': {
        "lr" : 1e-3
    },
    'epochs': 1, 
    'dry_run': False,  
    'batch_maxnum': 100 # Fast pass for development : only use ( batch_maxnum * batch_size ) samples
}

Declare and run the experiment

In [ ]:
from fedbiomed.researcher.experiment import Experiment
from fedbiomed.researcher.aggregators.fedavg import FedAverage
from fedbiomed.researcher.secagg import SecureAggregation
tags =  ['#MNIST', '#dataset']
rounds = 2

exp = Experiment(tags=tags,
                 model_args=model_args,
                 training_plan_class=MyTrainingPlan,
                 training_args=training_args,
                 round_limit=rounds,
                 aggregator=FedAverage(),
                 node_selection_strategy=None,
                 secagg=True, # or custom SecureAggregation(active=<bool>, clipping_range=<int>, timeout=<int>)
                 save_breakpoints=True)

Access secure aggregation context

Please use the secagg attribute to verify that secure aggregation is set as active.

In [ ]:
print("Is using secagg: ", exp.secagg.active)
print("Is using secagg: ", exp.secagg.active)

It is also possible to inspect the secure aggregation context through the secagg attribute. Since the secure aggregation context negotiation occurs during the experiment run, the context and its id should be None at this point.

In [ ]:
print("Secagg Biprime ", exp.secagg.biprime)
print("Secagg Servkey ", exp.secagg.servkey)
print("Secagg Biprime ", exp.secagg.biprime) print("Secagg Servkey ", exp.secagg.servkey)

Run the experiment using secure aggregation. The secure aggregation context will be created before the first training round, and it will be updated before each round where nodes are added to or removed from the experiment.

In [ ]:
exp.run(increase=True)

Display context after running one round of training.

In [ ]:
print("Secagg Biprime context: ", exp.secagg.biprime.context)
print("Secagg Servkey context: ", exp.secagg.servkey.context)
print("Secagg Biprime context: ", exp.secagg.biprime.context) print("Secagg Servkey context: ", exp.secagg.servkey.context)

Changes in the experiment trigger re-creation of the secure aggregation context

Changes that re-create jobs, such as adding a new node to the experiment, will trigger an automatic secure aggregation re-setup for the next round.

In [ ]:
# sends new dataset search request
from fedbiomed.researcher.strategies import DefaultStrategy
from fedbiomed.researcher.aggregators.fedavg import FedAverage
exp.set_training_data(None, True)
exp.set_strategy(DefaultStrategy)
exp.set_aggregator(FedAverage)
exp.set_job()
In [ ]:
exp.run_once(increase=True)

Changing arguments of secure aggregation

Setting the secagg argument to True in Experiment creates a default SecureAggregation instance. It is also possible to create a SecureAggregation instance yourself and pass it as an argument. Here are the arguments that can be set for SecureAggregation:

  • active: True if the round will use secure aggregation. The default is True.
  • clipping_range: the clipping range used for the quantization of model parameters. The default clipping range is 3. However, some models can have weights greater than 3. If the clipping range is exceeded during encryption on the nodes, Experiment will log a warning message. In such cases, you can provide a higher clipping range through this argument (see the illustrative sketch after this list).
  • timeout: the maximum amount of time, in seconds, that the experiment will wait for responses from all parties during secure aggregation setup. Since the secure aggregation context depends on network communication and multi-party computation, this argument allows setting a higher timeout for larger context setups, or vice versa.
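To make the role of clipping_range concrete, here is a minimal clip-then-quantize sketch of the kind of mapping applied to model parameters before encryption. It is illustrative plain Python with assumed parameter names and target range, not Fed-BioMed's actual quantization code.

def quantize(weights, clipping_range=3, target_range=2**16):
    # Map floats from [-clipping_range, clipping_range] to non-negative
    # integers; out-of-range values are saturated and lose information.
    scale = target_range / (2 * clipping_range)
    clipped = [max(-clipping_range, min(clipping_range, w)) for w in weights]
    return [int((c + clipping_range) * scale) for c in clipped]

print(quantize([0.5, -2.9, 4.2]))  # 4.2 is silently clipped to 3.0

Weights beyond the range all map to the same extreme integer, which is why the Experiment logs a warning and why raising clipping_range restores fidelity at the cost of coarser resolution.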
In [ ]:
from fedbiomed.researcher.secagg import SecureAggregation
secagg = SecureAggregation(
    active=True,
    clipping_range=100,
    timeout=15
)
exp.set_secagg(secagg=secagg)
In [ ]:
exp.run_once(increase=True)

Load experiment from a breakpoint

Once a breakpoint is loaded, if the secure aggregation context already exists there won't be a new context setup.

In [ ]:
loaded_exp = Experiment.load_breakpoint()
loaded_exp.info()
In [ ]:
loaded_exp.run_once(increase=True)