Publications

Below is a list of my articles, which is also available on my Google Scholar profile.

2023

  1. Value-Distributional Model-Based Reinforcement Learning
    Carlos E Luis, Alessandro G Bottero, Julia Vinogradska, Felix Berkenkamp, Jan Peters
    arXiv preprint arXiv:2308.06590.

    Quantifying uncertainty about a policy’s long-term performance is important to solve sequential decision-making tasks. We study the problem from a model-based Bayesian reinforcement learning perspective, where the goal is to learn the posterior distribution over value functions induced by parameter (epistemic) uncertainty of the Markov decision process. Previous work restricts the analysis to a few moments of the distribution over values or imposes a particular distribution shape, e.g., Gaussians. Inspired by distributional reinforcement learning, we introduce a Bellman operator whose fixed-point is the value distribution function. Based on our theory, we propose Epistemic Quantile-Regression (EQR), a model-based algorithm that learns a value distribution function that can be used for policy optimization. Evaluation across several continuous-control tasks shows performance benefits with respect to established model-based and model-free algorithms.
    @article{luis2023value,
      title = {Value-Distributional Model-Based Reinforcement Learning},
      author = {Luis, Carlos E and Bottero, Alessandro G and Vinogradska, Julia and Berkenkamp, Felix and Peters, Jan},
      journal = {arXiv preprint arXiv:2308.06590},
      urllink = {https://arxiv.org/pdf/2308.06590.pdf},
      year = {2023}
    }
    
  2. Model-Based Uncertainty in Value Functions
    Carlos E. Luis, Alessandro G. Bottero, Julia Vinogradska, Felix Berkenkamp, Jan Peters
    in Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, vol. 206.

    We consider the problem of quantifying uncertainty over expected cumulative rewards in model-based reinforcement learning. In particular, we focus on characterizing the variance over values induced by a distribution over MDPs. Previous work upper bounds the posterior variance over values by solving a so-called uncertainty Bellman equation, but the over-approximation may result in inefficient exploration. We propose a new uncertainty Bellman equation whose solution converges to the true posterior variance over values and explicitly characterizes the gap in previous work. Moreover, our uncertainty quantification technique is easily integrated into common exploration strategies and scales naturally beyond the tabular setting by using standard deep reinforcement learning architectures. Experiments in difficult exploration tasks, both in tabular and continuous control settings, show that our sharper uncertainty estimates improve sample-efficiency.
    @inproceedings{pmlr-v206-luis23a,
      title = {Model-Based Uncertainty in Value Functions},
      author = {Luis, Carlos E. and Bottero, Alessandro G. and Vinogradska, Julia and Berkenkamp, Felix and Peters, Jan},
      booktitle = {Proceedings of The 26th International Conference on Artificial Intelligence and Statistics},
      pages = {8029--8052},
      year = {2023},
      editor = {Ruiz, Francisco and Dy, Jennifer and van de Meent, Jan-Willem},
      volume = {206},
      series = {Proceedings of Machine Learning Research},
      month = {25--27 Apr},
      publisher = {PMLR},
      urllink = {https://arxiv.org/pdf/2302.12526.pdf},
      urlvideo = {https://youtu.be/ZMlF1w1H4T8?si=hP1HAoymhHxETlrq},
      code = {https://github.com/boschresearch/ube-mbrl}
    }
    

2022

  1. Information-Theoretic Safe Exploration with Gaussian Processes
    Alessandro Giacomo Bottero, Carlos E. Luis, Julia Vinogradska, Felix Berkenkamp, Jan Peters
    in Advances in Neural Information Processing Systems, vol. 35.

    We consider a sequential decision-making task where we are not allowed to evaluate parameters that violate an a priori unknown (safety) constraint. A common approach is to place a Gaussian process prior on the unknown constraint and allow evaluations only in regions that are safe with high probability. Most current methods rely on a discretization of the domain and cannot be directly extended to the continuous case. Moreover, the way in which they exploit regularity assumptions about the constraint introduces an additional critical hyperparameter. In this paper, we propose an information-theoretic safe exploration criterion that directly exploits the GP posterior to identify the most informative safe parameters to evaluate. Our approach is naturally applicable to continuous domains and does not require additional hyperparameters. We theoretically analyze the method and show that we do not violate the safety constraint with high probability and that we explore by learning about the constraint up to arbitrary precision. Empirical evaluations demonstrate improved data-efficiency and scalability.
    @inproceedings{bottero2022informationtheoretic,
      title = {Information-Theoretic Safe Exploration with Gaussian Processes},
      author = {Bottero, Alessandro Giacomo and Luis, Carlos E. and Vinogradska, Julia and Berkenkamp, Felix and Peters, Jan},
      booktitle = {Advances in Neural Information Processing Systems},
      year = {2022},
      volume = {35},
      pages = {30707--30719},
      urllink = {https://arxiv.org/pdf/2212.04914.pdf},
      code = {https://github.com/boschresearch/information-theoretic-safe-exploration}
    }
    

2020

  1. Online Trajectory Generation with Distributed Model Predictive Control for Multi-Robot Motion Planning
    Carlos E. Luis, Marijan Vukosavljev, Angela P. Schoellig
    IEEE Robotics and Automation Letters, vol. 5, no. 2.

    We present a distributed model predictive control (DMPC) algorithm to generate trajectories in real time for multiple robots. We adopted the on-demand collision avoidance method presented in previous work to efficiently compute non-colliding trajectories in transition tasks. An event-triggered replanning strategy is proposed to account for disturbances. Our simulation results show that the proposed collision avoidance method can reduce the travel time required to complete a multi-agent point-to-point transition by around 50% on average when compared to the well-studied Buffered Voronoi Cells (BVC) approach. Additionally, it shows a higher success rate in transition tasks with a high density of agents, achieving a success rate above 90% with 30 palm-sized quadrotor agents in an 18 m^3 arena. The approach was experimentally validated with a swarm of up to 20 drones flying in close proximity.
    @article{luis-ral20,
      title = {Online Trajectory Generation with Distributed Model Predictive Control for Multi-Robot Motion Planning},
      author = {Luis, Carlos E. and Vukosavljev, Marijan and Schoellig, Angela P.},
      journal = {IEEE Robotics and Automation Letters},
      year = {2020},
      volume = {5},
      number = {2},
      pages = {604--611},
      doi = {10.1109/LRA.2020.2964159},
      urlvideo = {https://www.youtube.com/watch?v=N4rWiraIU2k},
      urllink = {https://arxiv.org/pdf/1909.05150.pdf},
      code = {https://github.com/carlosluis/online_dmpc}
    }
    

2019

  1. Trajectory Generation for Multiagent Point-To-Point Transitions via Distributed Model Predictive Control
    Carlos E. Luis, Angela P. Schoellig
    IEEE Robotics and Automation Letters, vol. 4, no. 2.

    This paper introduces a novel algorithm for multiagent offline trajectory generation based on distributed model predictive control (DMPC). By predicting future states and sharing this information with their neighbours, the agents are able to detect and avoid collisions while moving towards their goals. The proposed algorithm computes transition trajectories for dozens of vehicles in a few seconds. It reduces the computation time by more than 85% compared to previous optimization approaches based on sequential convex programming (SCP), while causing only a small impact on the optimality of the plans. We replaced the previous compatibility constraints in DMPC, which limit the motion of the agents in order to avoid collisions, by relaxing the collision constraints and enforcing them only when required. The approach was validated both through extensive simulations for a wide range of randomly generated transitions and with teams of up to 25 quadrotors flying in confined indoor spaces.
    @article{luis-ral19,
      title = {Trajectory Generation for Multiagent Point-To-Point Transitions via Distributed Model Predictive Control},
      author = {Luis, Carlos E. and Schoellig, Angela P.},
      journal = {IEEE Robotics and Automation Letters},
      year = {2019},
      volume = {4},
      number = {2},
      pages = {375--382},
      urllink = {https://arxiv.org/abs/1809.04230},
      urlvideo = {https://youtu.be/ZN2e7h-kkpw},
      code = {https://github.com/carlosluis/multiagent_planning}
    }
    
  2. Fast and in sync: periodic swarm patterns for quadrotors
    Xintong Du, Carlos E. Luis, Marijan Vukosavljev, Angela P. Schoellig
    in Proc. of the IEEE International Conference on Robotics and Automation (ICRA).

    This paper aims to design quadrotor swarm performances, where the swarm acts as an integrated, coordinated unit embodying moving and deforming objects. We divide the task of creating a choreography into three basic steps: designing swarm motion primitives, transitioning between those movements, and synchronizing the motion of the drones. The result is a flexible framework for designing choreographies comprised of a wide variety of motions. The motion primitives can be intuitively designed using few parameters, providing a rich library for choreography design. Moreover, we combine and adapt existing goal assignment and trajectory generation algorithms to maximize the smoothness of the transitions between motion primitives. Finally, we propose a correction algorithm to compensate for motion delays and synchronize the motion of the drones to a desired periodic motion pattern. The proposed methodology was validated experimentally by generating and executing choreographies on a swarm of 25 quadrotors.
    @inproceedings{du-icra19,
      author = {Du, Xintong and Luis, Carlos E. and Vukosavljev, Marijan and Schoellig, Angela P.},
      title = {Fast and in sync: periodic swarm patterns for quadrotors},
      booktitle = {Proc. of the IEEE International Conference on Robotics and Automation (ICRA)},
      year = {2019},
      pages = {9143--9149},
      urllink = {https://arxiv.org/abs/1810.03572},
      urlvideo = {https://youtu.be/Iw8mwt3l0RE}
    }
    
  3. Towards Scalable Online Trajectory Generation for Multi-robot Systems
    Carlos E. Luis, Marijan Vukosavljev, Angela P. Schoellig
    Abstract and Poster, in Proc. of the Resilient Robot Teams Workshop at the IEEE International Conference on Robotics and Automation (ICRA).

    We present a distributed model predictive control (DMPC) algorithm to generate trajectories in real time for multiple robots, taking into account their trajectory tracking dynamics and actuation limits. An event-triggered replanning strategy is proposed to account for disturbances in the system. We adopted the on-demand collision avoidance method presented in previous work to efficiently compute non-colliding trajectories in transition tasks. Preliminary results in simulation show a higher success rate than previous online methods based on Buffered Voronoi Cells (BVC), while maintaining computational tractability for real-time operation.
    @misc{luis-icra19,
      author = {Luis, Carlos E. and Vukosavljev, Marijan and Schoellig, Angela P.},
      title = {Towards Scalable Online Trajectory Generation for Multi-robot Systems},
      year = {2019},
      howpublished = {Abstract and Poster, in Proc. of the Resilient Robot Teams Workshop at the IEEE International Conference on Robotics and Automation (ICRA)}
    }
    

2016

  1. Design of a trajectory tracking controller for a nanoquadcopter
    Carlos E. Luis, Jerome Le Ny
    Technical Report, Ecole Polytechnique de Montreal.

    The primary purpose of this study is to investigate the system modeling of a nanoquadcopter and to design position and trajectory control algorithms, with the ultimate goal of testing the system both in simulation and on a real platform. The open-source nanoquadcopter platform Crazyflie 2.0 was chosen for the project. The first phase consisted of developing a mathematical model that describes the dynamics of the quadcopter. Second, a simulation environment was created to design two control architectures: a cascaded PID position tracker and an LQT trajectory tracker. Finally, the implementation phase consisted of testing the controllers on the chosen platform and comparing their performance in trajectory tracking.
    @misc{luis-polytechnique16,
      author = {Luis, Carlos E. and Le Ny, Jerome},
      title = {Design of a trajectory tracking controller for a nanoquadcopter},
      year = {2016},
      urllink = {https://arxiv.org/pdf/1608.05786.pdf},
      urlvideo = {https://youtu.be/c-SXovCyhJQ},
      howpublished = {Technical Report, Ecole Polytechnique de Montreal}
    }