WIP integrating Learning in evolution #84

Open · wants to merge 50 commits into base: development

Changes from 1 commit (e616cfb)

Commits (50)
cf345d0  Simplified Deserialisation method. (etiennegalea, Nov 19, 2019)
afcf01c  Merge remote-tracking branch 'origin/development' into development (etiennegalea, Nov 24, 2019)
6645efb  Created new protobuf message (etiennegalea, Nov 24, 2019)
deeede0  Skeleton for new Learner architecture (Nov 25, 2019)
354051e  All classes for learner abstraction in place now (Nov 27, 2019)
dee5dfb  Added hardcoded parameters to BayesianOptimizer. (Nov 27, 2019)
114a213  Merge remote-tracking branch 'upstream/development' into development (Nov 28, 2019)
fc88765  Refactoring (Nov 29, 2019)
5363462  Plugging in NoLearner (Nov 29, 2019)
86f9eea  Added new robot_states.proto for learning evaluations (etiennegalea, Nov 29, 2019)
f5aeda1  Merge remote-tracking branch 'origin/development' into development (etiennegalea, Nov 29, 2019)
71b5314  devectorize_cpg_controller and refactor (Dec 2, 2019)
4040ee4  implement optimization step and fitness saving. (Dec 2, 2019)
b7c6dfe  Plug new learner into robot controller (Dec 3, 2019)
5b5719f  Merge branch 'bayes_opt' into evaluator_proto_msg (etiennegalea, Dec 3, 2019)
7e87217  Bayesian Optimization integration (portaloffreedom, Dec 4, 2019)
8c77222  fix limbo compilation issue (portaloffreedom, Dec 5, 2019)
bc90434  added Hyperneat learner - WIP (portaloffreedom, Dec 5, 2019)
2011271  Merge branch 'learning' into evaluator_proto_msg (etiennegalea, Dec 5, 2019)
471459d  Evaluation Reporter WIP (portaloffreedom, Dec 5, 2019)
8d373bc  Merge remote-tracking branch 'upstream/learning' into evaluator_proto… (etiennegalea, Dec 5, 2019)
2c590e9  node pubblisher for gazebo reporter (portaloffreedom, Dec 5, 2019)
99d02e1  Merge remote-tracking branch 'upstream/learning' into evaluator_proto… (etiennegalea, Dec 5, 2019)
c0b20d6  improved hyperneat (portaloffreedom, Dec 5, 2019)
55f9cf3  Added protobuf construction and sending to GazeboReporter. Changed fl… (etiennegalea, Dec 5, 2019)
c4a3ad9  Merge remote-tracking branch 'etienne/evaluator_proto_msg' into learning (portaloffreedom, Dec 5, 2019)
fcfd933  Changed protobuf msg ID to string (etiennegalea, Dec 5, 2019)
fc428cd  Merge remote-tracking branch 'etienne/evaluator_proto_msg' into learning (portaloffreedom, Dec 5, 2019)
79a2e39  GazeboReporter implementation ready (portaloffreedom, Dec 5, 2019)
e616cfb  improve learning classes (portaloffreedom, Dec 6, 2019)
b370be1  removed old pygazebo and improved .gitignore (portaloffreedom, Dec 6, 2019)
ddb361e  WIP: new CPG BO loading integration (portaloffreedom, Dec 9, 2019)
beff4fd  Merge branch 'learning' of github.com:ci-group/revolve into learning (portaloffreedom, Dec 9, 2019)
ac3f28d  Fixed brain imports as it lower version of python results in errors. (etiennegalea, Dec 10, 2019)
4ef63a8  Debugging (Dec 10, 2019)
f7c4a9b  Merge branch 'learning' of github.com:etiennegalea/revolve into learning (Dec 10, 2019)
7f3f102  Fixed learning (portaloffreedom, Dec 11, 2019)
aae3a2d  implemented brain (portaloffreedom, Dec 11, 2019)
abb2478  new plugin that supports reports is a work in progress... (portaloffreedom, Dec 12, 2019)
7e8c595  Implemented new world and robot manager for the learning steps (portaloffreedom, Jan 2, 2020)
3d994f1  Fixed PIGPIO compile warning (portaloffreedom, Mar 10, 2020)
3aa3349  removed old code (portaloffreedom, Mar 10, 2020)
d9571cb  Fixed MultiNEAT Neural Network bug (portaloffreedom, Mar 10, 2020)
abdb362  Targeted locomotion for robots (HW first) (portaloffreedom, Mar 10, 2020)
809bc9e  catch up with local (Apr 24, 2020)
5b027b7  General: (Oct 7, 2020)
5b9c719  General: (Oct 7, 2020)
a931e5d  General: (Oct 7, 2020)
265bf5b  General: (Oct 7, 2020)
2411596  Merge GECCO paper branch from Fuda van Diggelen (DaanZ, Feb 25, 2021)
improve learning classes
portaloffreedom committed Dec 6, 2019
commit e616cfb8a6326aec19ab9fe541b1afdb3e169326
106 changes: 44 additions & 62 deletions cpprevolve/revolve/brains/learner/BayesianOptimizer.cpp
@@ -25,24 +25,21 @@ BayesianOptimizer::BayesianOptimizer(
         Evaluator *evaluator,
         EvaluationReporter *reporter,
         const double evaluation_time,
-        const size_t n_learning_evalutions)
-    : Learner(evaluator, reporter)
+        const unsigned int n_learning_evalutions)
+    : Learner(evaluator, reporter, evaluation_time, n_learning_evalutions)
     , _controller(std::move(controller))
-    , evaluation_time(evaluation_time)
-    , evaluation_end_time(-1)
-    , n_learning_iterations(n_learning_evalutions)
-    , n_init_samples(1)
-    //, init_method("LHS")
-    , kernel_noise(0.00000001)
-    , kernel_optimize_noise("false")
-    , kernel_sigma_sq(0.222)
-    , kernel_l(0.55)
-    , kernel_squared_exp_ard_k(4)
-    , acqui_gpucb_delta(0.5)
-    , acqui_ucb_alpha(0.44)
-    , acqui_ei_jitter(0)
-    , acquisition_function("UCB")
 {
+    this->n_init_samples = 1;
+    //this->init_method = "LHS";
+    this->kernel_noise = 0.00000001;
+    this->kernel_optimize_noise = "false";
+    this->kernel_sigma_sq = 0.222;
+    this->kernel_l = 0.55;
+    this->kernel_squared_exp_ard_k = 4;
+    this->acqui_gpucb_delta = 0.5;
+    this->acqui_ucb_alpha = 0.44;
+    this->acqui_ei_jitter = 0;
+    this->acquisition_function = "UCB";
 
     if (typeid(this->_controller) == typeid(std::unique_ptr<revolve::DifferentialCPG>)) {
         devectorize_controller = [this](Eigen::VectorXd weights) {
@@ -52,12 +49,12 @@ BayesianOptimizer::BayesianOptimizer(
                 std_weights[j] = weights(j);
             }
 
-            auto *temp_controller = dynamic_cast<::revolve::DifferentialCPG *>(this->controller());
+            auto *temp_controller = dynamic_cast<::revolve::DifferentialCPG *>(this->_controller.get());
             temp_controller->set_connection_weights(std_weights);
         };
 
         vectorize_controller = [this]() {
-            auto *controller = dynamic_cast<::revolve::DifferentialCPG *>( this->controller());
+            auto *controller = dynamic_cast<::revolve::DifferentialCPG *>(this->_controller.get());
             const std::vector<double> &weights = controller->get_connection_weights();
 
             // std::vector -> Eigen::Vector
@@ -171,56 +168,41 @@ BO_DECLARE_DYN_PARAM(double, BayesianOptimizer::params::acqui_ucb, alpha);
 BO_DECLARE_DYN_PARAM(double, BayesianOptimizer::params::kernel_maternfivehalves, sigma_sq);
 BO_DECLARE_DYN_PARAM(double, BayesianOptimizer::params::kernel_maternfivehalves, l);
 
-void BayesianOptimizer::optimize(double current_time, double dt)
+void BayesianOptimizer::init_first_controller()
 {
-    if (current_time < evaluation_end_time) return;
+    assert(n_init_samples == 1 and "INIT SAMPLES > 1 not supported");
 
-    // init
-    if (samples.empty()) {
-        assert(n_init_samples == 1 and "INIT SAMPLES > 1 not supported");
-
-        // Save these weights
-        this->samples.push_back(this->vectorize_controller());
-        this->current_iteration = 0;
-    }
-    else // optimization step
-    {
-        params::acqui_ucb::set_alpha(this->acqui_ucb_alpha);
-        params::kernel_maternfivehalves::set_l(this->kernel_l);
-        params::kernel_maternfivehalves::set_sigma_sq(this->kernel_sigma_sq);
-
-        Eigen::VectorXd x;
-
-        // Specify bayesian optimizer. TODO: Make attribute and initialize at bo_init
-        limbo::bayes_opt::BOptimizer<params,
-            limbo::initfun<Init_t>,
-            limbo::modelfun<GP_t>,
-            limbo::acquifun<limbo::acqui::UCB<BayesianOptimizer::params, GP_t >>> boptimizer;
-
-        // Optimize. Pass dummy evaluation function and observations .
-        boptimizer.optimize(BayesianOptimizer::evaluation_function(this->samples[0].size()),
-            this->samples,
-            this->observations);
-        x = boptimizer.last_sample();
-        this->samples.push_back(x);
-
-        this->save_fitness();
-    }
-
-    // wait for next evaluation
-    this->evaluation_end_time = current_time + evaluation_time;
-    // Reset Evaluator
-    this->evaluator->reset();
+    // Save the initial weights
+    this->samples.push_back(this->vectorize_controller());
 }
 
-/**
- * Function that obtains the current fitness by calling the evaluator and stores it
- */
-void BayesianOptimizer::save_fitness()
+void BayesianOptimizer::init_next_controller()
 {
-    // Get fitness
-    double fitness = this->evaluator->fitness();
+    //TODO are these global variables 😱?
+    params::acqui_ucb::set_alpha(this->acqui_ucb_alpha);
+    params::kernel_maternfivehalves::set_l(this->kernel_l);
+    params::kernel_maternfivehalves::set_sigma_sq(this->kernel_sigma_sq);
+
+    // Specify bayesian optimizer. TODO: Make attribute and initialize at bo_init
+    limbo::bayes_opt::BOptimizer<params,
+        limbo::initfun<Init_t>,
+        limbo::modelfun<GP_t>,
+        limbo::acquifun<limbo::acqui::UCB<BayesianOptimizer::params, GP_t >>> boptimizer;
+
+    // Optimize. Pass evaluation function and observations .
+    boptimizer.optimize(BayesianOptimizer::evaluation_function(this->samples[0].size()),
+        this->samples,
+        this->observations);
+
+    Eigen::VectorXd x = boptimizer.last_sample();
+    this->samples.push_back(x);
+
+    // load into controller
+    this->devectorize_controller(x);
+}
 
+void BayesianOptimizer::finalize_current_controller(double fitness)
+{
     // Save connection_weights if it is the best seen so far
     if(fitness > this->best_fitness)
     {
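The vectorize/devectorize pair above is the bridge between limbo, which proposes samples as Eigen vectors, and the CPG controller, which stores its connection weights as a flat std::vector<double>. Below is a minimal, self-contained sketch of that mapping; FakeCPG is a hypothetical stand-in for revolve::DifferentialCPG, and none of these names come from the PR itself.

    #include <functional>
    #include <memory>
    #include <utility>
    #include <vector>
    #include <Eigen/Core>

    // Stand-in for revolve::DifferentialCPG: any controller that exposes its
    // connection weights as a flat std::vector<double>.
    struct FakeCPG {
        std::vector<double> weights{0.1, 0.2, 0.3};
        const std::vector<double> &get_connection_weights() const { return weights; }
        void set_connection_weights(std::vector<double> w) { weights = std::move(w); }
    };

    int main() {
        auto controller = std::make_unique<FakeCPG>();

        // vectorize: controller state -> optimizer sample (Eigen::VectorXd)
        std::function<Eigen::VectorXd()> vectorize_controller = [&controller]() {
            const std::vector<double> &w = controller->get_connection_weights();
            Eigen::VectorXd sample(w.size());
            for (size_t i = 0; i < w.size(); ++i) sample(i) = w[i];
            return sample;
        };

        // devectorize: optimizer sample -> controller state
        std::function<void(Eigen::VectorXd)> devectorize_controller =
            [&controller](Eigen::VectorXd sample) {
                std::vector<double> w(sample.size());
                for (Eigen::Index i = 0; i < sample.size(); ++i) w[i] = sample(i);
                controller->set_connection_weights(std::move(w));
            };

        Eigen::VectorXd x = vectorize_controller();  // current weights as a sample
        x *= 2.0;                   // pretend the optimizer proposed new weights
        devectorize_controller(x);  // load the proposal back into the controller
    }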
31 changes: 9 additions & 22 deletions cpprevolve/revolve/brains/learner/BayesianOptimizer.h
@@ -2,9 +2,9 @@
 // Created by matteo on 14/06/19.
 //
 
-#ifndef REVOLVE_BAYESIANOPTIMIZER_H
-#define REVOLVE_BAYESIANOPTIMIZER_H
+#pragma once
 
+#include <limits>
 #include "Learner.h"
 #include "../controller/Controller.h"
 #include "../controller/DifferentialCPG.h"
@@ -19,21 +19,18 @@ class BayesianOptimizer : public Learner
             Evaluator *evaluator,
             EvaluationReporter *reporter,
             double evaluation_time,
-            size_t n_learning_evalutions);
+            unsigned int n_learning_evalutions);
 
     /// \brief Destructor
     ~BayesianOptimizer() = default;
 
-    /// \brief performes the optimization of the controller. Used as a proxy to call the right optimization method
-    void optimize(double time, double dt) override;
+    void init_first_controller() override;
+    void init_next_controller() override;
+    void finalize_current_controller(double fitness) override;
 
     Controller *controller() override
     { return this->_controller.get(); }
 
-    /// \brief bookeeping of the fitnensses
-    void save_fitness();
-
-
 public:
 
     /// \brief parameters for optimization
@@ -64,8 +61,6 @@ class BayesianOptimizer : public Learner
 
 protected:
     std::unique_ptr<::revolve::Controller> _controller;
-    const double evaluation_time;
-    double evaluation_end_time;
 
     // BO Learner parameters
     double kernel_noise;
@@ -80,33 +75,25 @@ class BayesianOptimizer : public Learner
     /// \brief Specifies the acquisition function used
     std::string acquisition_function;
 
-    /// \brief Max number of iterations learning is allowed
-    size_t n_learning_iterations;
-
    /// \brief Number of initial samples
     size_t n_init_samples;
 
     /// \brief All samples seen so far.
     std::vector <Eigen::VectorXd> samples;
 
-    /// \brief BO attributes
-    size_t current_iteration = 0;
+    /// \brief All fitnesses seen so far. Called observations in limbo context
+    std::vector< Eigen::VectorXd > observations;
 
     /// \brief function to turn the controller into a sample
     std::function<Eigen::VectorXd()> vectorize_controller;
 
     /// \brief function to turn a sample into a controller
     std::function<void(Eigen::VectorXd)> devectorize_controller;
 
-    /// \brief All fitnesses seen so far. Called observations in limbo context
-    std::vector< Eigen::VectorXd > observations;
-
     /// \brief Best fitness seen so far
-    double best_fitness = -10.0;
+    double best_fitness = -std::numeric_limits<double>::infinity();
 
     /// \brief Sample corresponding to best fitness
     Eigen::VectorXd best_sample;
 };
 }
-
-#endif //REVOLVE_BAYESIANOPTIMIZER_H
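The change of best_fitness from -10.0 to -std::numeric_limits<double>::infinity() is more than cosmetic: a fixed sentinel silently discards any controller whose true fitness falls below it, while negative infinity guarantees that the first evaluated controller is always recorded. A standalone illustration of the failure mode (not PR code):

    #include <cassert>
    #include <limits>

    // With best_fitness = -10.0, a controller whose real fitness is below -10
    // is never recorded as the best seen so far. -infinity compares smaller
    // than every finite fitness, so the first evaluation always wins.
    int main() {
        double best_fitness = -std::numeric_limits<double>::infinity();
        double first_fitness = -42.0;  // e.g. a robot drifting away from the target
        if (first_fitness > best_fitness) best_fitness = first_fitness;
        assert(best_fitness == -42.0); // recorded; a -10.0 sentinel would drop it
    }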
36 changes: 11 additions & 25 deletions cpprevolve/revolve/brains/learner/HyperNEAT.cpp
@@ -14,14 +14,10 @@ HyperNEAT::HyperNEAT(
         const int seed,
         const double evaluation_time,
         unsigned int n_evaluations)
-    : Learner(evaluator, reporter)
-    , evaluation_time(evaluation_time)
-    , end_controller_time(-1)
+    : Learner(evaluator, reporter, evaluation_time, n_evaluations)
     , _controller(std::move(controller))
     , params(params)
     , population(nullptr)
-    , evaluation_counter(0)
-    , n_evaluations(n_evaluations)
 {
     NEAT::Genome start_genome(0, 3, 0, 1, //TODO these are also parameters
                               false,
@@ -38,30 +34,19 @@
                               1.0,
                               seed
                               ));
+}
 
+void HyperNEAT::init_first_controller()
+{
     current_specie_evaluating = population->m_Species.begin();
     current_genome_evaluating = current_specie_evaluating->m_Individuals.begin();
+
+    //TODO load genome in controller
 }
 
-void HyperNEAT::optimize(const double time, const double dt)
+void HyperNEAT::init_next_controller()
 {
-    if (end_controller_time < 0) {
-        end_controller_time = time;
-        return;
-    }
-    if (end_controller_time < time) return;
-
-    evaluation_counter++;
-    double fitness = evaluator->fitness();
-    //TODO check if you finished the budget of generations
-    bool finished = evaluation_counter >= n_evaluations;
-
-    evaluation_reporter->report(evaluation_counter, finished, fitness);
-    current_genome_evaluating->SetFitness(fitness);
-
-    // load next genome
-    if (finished) return;
-
     current_genome_evaluating++;
 
     // Finished a species
@@ -79,9 +64,10 @@ void HyperNEAT::optimize(const double time, const double dt)
         current_genome_evaluating = current_specie_evaluating->m_Individuals.begin();
     }
 
     // You have a valid genome
     //TODO load genome in controller
 }
-
-    evaluator->reset();
-    end_controller_time = time + evaluation_time;
+void HyperNEAT::finalize_current_controller(double fitness)
+{
+    current_genome_evaluating->SetFitness(fitness);
+}
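HyperNEAT walks the population with a pair of iterators, one over species and one over the genomes inside the current species; the hunks above elide the species-rollover branch. The following standalone sketch shows that traversal with plain structs standing in for NEAT::Species and NEAT::Genome; next_individual is an illustrative helper, not MultiNEAT API:

    #include <cassert>
    #include <vector>

    // Plain stand-ins for NEAT::Genome / NEAT::Species: a population is a
    // vector of species, each holding a vector of individuals.
    struct Genome { double fitness = 0.0; };
    struct Species { std::vector<Genome> m_Individuals; };

    // Advance the (species, genome) iterator pair to the next individual,
    // rolling over to the next species when the current one is exhausted.
    // Returns false once every individual of every species has been visited.
    bool next_individual(std::vector<Species> &population,
                         std::vector<Species>::iterator &cur_species,
                         std::vector<Genome>::iterator &cur_genome)
    {
        ++cur_genome;
        while (cur_genome == cur_species->m_Individuals.end()) {
            ++cur_species;                                     // finished a species
            if (cur_species == population.end()) return false; // finished everyone
            cur_genome = cur_species->m_Individuals.begin();
        }
        return true;
    }

    int main() {
        std::vector<Species> population(2);
        population[0].m_Individuals.resize(2);
        population[1].m_Individuals.resize(1);

        // init_first_controller(): point at the very first genome
        auto cur_species = population.begin();
        auto cur_genome = cur_species->m_Individuals.begin();

        // init_next_controller(), repeated: visit the remaining genomes
        int visited = 1;
        while (next_individual(population, cur_species, cur_genome)) ++visited;
        assert(visited == 3);
    }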
10 changes: 4 additions & 6 deletions cpprevolve/revolve/brains/learner/HyperNEAT.h
@@ -23,22 +23,20 @@ class HyperNEAT: public Learner
             double evaluation_time,
             unsigned int n_evaluations);
 
-    virtual ~HyperNEAT() = default;
+    ~HyperNEAT() override = default;
 
     Controller *controller() override
     { return _controller.get(); }
 
-    void optimize(double time, double dt) override;
+    void init_first_controller() override;
+    void init_next_controller() override;
+    void finalize_current_controller(double fitness) override;
 
 private:
-    double evaluation_time;
-    double end_controller_time;
     std::unique_ptr<Controller> _controller;
 
     const NEAT::Parameters params;
     std::unique_ptr<NEAT::Population> population;
-    unsigned int evaluation_counter;
-    unsigned int n_evaluations;
     std::vector<NEAT::Species>::iterator current_specie_evaluating;
     std::vector<NEAT::Genome>::iterator current_genome_evaluating;
 };
37 changes: 37 additions & 0 deletions cpprevolve/revolve/brains/learner/Learner.cpp
@@ -0,0 +1,37 @@
+//
+// Created by matteo on 12/6/19.
+//
+
+#include "Learner.h"
+
+using namespace revolve;
+
+void Learner::optimize(double time, double /*dt*/)
+{
+    if (time < end_controller_time) return;
+
+    // first evaluation
+    if (evaluation_counter < 0)
+    {
+        evaluation_counter = 0;
+        this->init_first_controller();
+    }
+    else
+    {
+        // finalize previous run
+        evaluation_counter++;
+        double fitness = evaluator->fitness();
+        bool finished = evaluation_counter >= n_evaluations;
+
+        evaluation_reporter->report(evaluation_counter, finished, fitness);
+        this->finalize_current_controller(fitness);
+
+        if (finished) return;
+
+        // load next genome
+        this->init_next_controller();
+    }
+
+    evaluator->reset();
+    end_controller_time = time + evaluation_time;
+}
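Learner::optimize is a template method: the base class owns the evaluation clock, the fitness query, and the reporting, while subclasses supply only the three hooks. A standalone sketch of the same lifecycle with the Revolve types stubbed out (ToyEvaluator and ToyLearner are illustrative, not PR classes); the guard waits until the current evaluation window has elapsed before finalizing:

    #include <cstdio>

    struct ToyEvaluator {
        double value = 0.0;
        void reset() { value = 0.0; }   // restart the fitness measurement
        double fitness() { return value; }
    };

    class ToyLearner {
    public:
        ToyLearner(ToyEvaluator *evaluator, double evaluation_time, int n_evaluations)
            : evaluator(evaluator)
            , evaluation_time(evaluation_time)
            , n_evaluations(n_evaluations)
        {}

        // Wait out the current evaluation window, then finalize the controller
        // that just ran and install the next one.
        void optimize(double time)
        {
            if (time < end_controller_time) return;  // evaluation still running

            if (evaluation_counter < 0) {            // very first call
                evaluation_counter = 0;
                init_first_controller();
            } else {
                evaluation_counter++;
                double fitness = evaluator->fitness();
                bool finished = evaluation_counter >= n_evaluations;
                finalize_current_controller(fitness);
                if (finished) return;
                init_next_controller();
            }
            evaluator->reset();
            end_controller_time = time + evaluation_time;
        }

    private:
        void init_first_controller() { std::puts("install first controller"); }
        void init_next_controller() { std::puts("install next controller"); }
        void finalize_current_controller(double f) { std::printf("fitness %.2f\n", f); }

        ToyEvaluator *evaluator;
        double evaluation_time;
        double end_controller_time = -1.0;  // in the past: fire on the first call
        int evaluation_counter = -1;
        int n_evaluations;
    };

    int main() {
        ToyEvaluator evaluator;
        ToyLearner learner(&evaluator, /*evaluation_time=*/30.0, /*n_evaluations=*/3);
        for (double t = 0.0; t <= 150.0; t += 10.0) {
            evaluator.value += 1.0;  // fake fitness accumulating during the run
            learner.optimize(t);
        }
    }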
30 changes: 25 additions & 5 deletions cpprevolve/revolve/brains/learner/Learner.h
@@ -4,30 +4,50 @@
 
 #pragma once
 
-#include "../controller/Controller.h"
+#include <limits>
 #include "Evaluator.h"
 #include "EvaluationReporter.h"
+#include "../controller/Controller.h"
 
 namespace revolve {
 
 class Learner
 {
 public:
     /// \brief Constructor
-    explicit Learner(Evaluator *evaluator, EvaluationReporter *reporter)
-            : evaluator(evaluator)
+    explicit Learner(
+            Evaluator *const evaluator,
+            EvaluationReporter *const reporter,
+            const double evaluation_time,
+            const unsigned int n_evaluations)
+            : evaluation_time(evaluation_time)
+            , end_controller_time(-std::numeric_limits<double>::infinity())
+            , evaluation_counter(-1)
+            , n_evaluations(n_evaluations)
+            , evaluator(evaluator)
             , evaluation_reporter(reporter)
-            {}
+    {}
 
     /// \brief Destructor
     virtual ~Learner() = default;
 
     /// \brief performs the optimization of the controller
-    virtual void optimize(double time, double dt) = 0;
+    virtual void optimize(double time, double dt);
+    virtual void init_first_controller() = 0;
+    virtual void init_next_controller() = 0;
+    virtual void finalize_current_controller(double fitness) = 0;
 
     virtual revolve::Controller *controller() = 0;
 
 protected:
+    const double evaluation_time;
+    double end_controller_time;
+
+    /// \brief Learning iterations counter
+    int evaluation_counter;
+    /// \brief Max number of learning iterations
+    const unsigned int n_evaluations;
+
     revolve::Evaluator *evaluator;
     revolve::EvaluationReporter *evaluation_reporter;
 };
12 changes: 7 additions & 5 deletions cpprevolve/revolve/brains/learner/NoLearner.h
@@ -14,19 +14,21 @@ class NoLearner : public Learner
 {
 public:
     explicit NoLearner(std::unique_ptr<Controller> controller)
-            : Learner(nullptr, nullptr) //TODO add report
+            : Learner(nullptr, nullptr, 0, 0) //TODO add report
             , _controller(std::move(controller))
     {}
 
     // This is inspired from the GNU `std::make_unique` source code
     template<typename... _Args>
-    NoLearner(_Args &&... args)
-            : Learner(nullptr, nullptr) //TODO add report
+    explicit NoLearner(_Args &&... args)
+            : Learner(nullptr, nullptr, 0, 0) //TODO add report
             , _controller(new ControllerType(std::forward<_Args>(args)...))
     {}
 
-    void optimize(double time, double dt) override
-    {}
+    void optimize(double /*time*/, double /*dt*/) override {}
+    void init_first_controller() override {}
+    void init_next_controller() override {}
+    void finalize_current_controller(double /*fitness*/) override {}
 
     Controller *controller() override
     { return this->_controller.get(); }
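To close, a sketch of how the two NoLearner constructors interact, assuming the class is a template over ControllerType as the hunk context suggests (StubController and ToyNoLearner are illustrative stand-ins). When the argument is exactly std::unique_ptr<ControllerType>, overload resolution prefers the non-template constructor; any other argument list is forwarded to the controller's own constructor, make_unique style:

    #include <memory>
    #include <utility>

    // StubController stands in for a real controller such as DifferentialCPG,
    // whose actual constructor arguments are more involved.
    struct StubController {
        explicit StubController(int n_actuators) : n(n_actuators) {}
        int n;
    };

    // ToyNoLearner mirrors the two-constructor shape of NoLearner above.
    template <typename ControllerType>
    struct ToyNoLearner {
        // (1) adopt a controller that was already built elsewhere
        explicit ToyNoLearner(std::unique_ptr<ControllerType> controller)
            : _controller(std::move(controller)) {}

        // (2) make_unique style: forward the arguments to ControllerType's ctor
        template <typename... Args>
        explicit ToyNoLearner(Args &&... args)
            : _controller(new ControllerType(std::forward<Args>(args)...)) {}

        std::unique_ptr<ControllerType> _controller;
    };

    int main() {
        // Uses (1): the non-template overload beats the forwarding template
        // when the argument is exactly std::unique_ptr<ControllerType>.
        ToyNoLearner<StubController> a(std::make_unique<StubController>(8));

        // Uses (2): arguments are forwarded straight to StubController(int).
        ToyNoLearner<StubController> b(4);
    }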