The aim of my work is to bridge the gap between theoretical knowledge and its application to real-world decision-making problems, particularly sequential decision making, where actions affect future situations and outcomes. My research focuses on sequential decision making with reinforcement learning methods, which I believe can contribute significantly to this area.
In my research, I investigate the challenges of learning with delayed feedback, such as scarce feedback signals or long waiting times before feedback is observed. I measure the impact of these conditions on learning agents and develop methods to overcome them. In industry, I develop solutions to common problems that arise in recommender systems. These problems range from learning from biased data and working with a limited exploration budget to learning under uncertainty and dealing with large action spaces.
Other areas of interest to me are distributed systems, multi-agent problems, and representation learning.
Contact: guilherme@dsv.su.se
Publications
Dinis Junior, Guilherme, Sindri Magnússon, and Jaakko Hollmén. “Policy Control with Delayed, Aggregated Anonymous Feedback.” In Machine Learning and Knowledge Discovery in Databases: Research Track. ECML PKDD 2024.
Baran, Buket, Guilherme Dinis Junior, Antonina Danylenko, Olayinka S. Folorunso, Gösta Forsum, Maksym Lefarov, Lucas Maystre, and Yu Zhao. “Accelerating Creator Audience Building through Centralized Exploration.” In Proceedings of the 17th ACM Conference on Recommender Systems, 70–73. RecSys ’23. New York, NY, USA: Association for Computing Machinery, 2023. https://doi.org/10.1145/3604915.3608880.
Movin, Maria, Guilherme Dinis Junior, Jaakko Hollmén, and Panagiotis Papapetrou. “Explaining Black Box Reinforcement Learning Agents Through Counterfactual Policies.” In Advances in Intelligent Data Analysis XXI, edited by Bruno Crémilleux, Sibylle Hess, and Siegfried Nijssen, 314–26. Cham: Springer Nature Switzerland, 2023. https://doi.org/10.1007/978-3-031-30047-9_25.
Dinis Junior, Guilherme, Sindri Magnússon, and Jaakko Hollmén. “Policy Evaluation with Delayed, Aggregated Anonymous Feedback.” In Discovery Science, edited by Poncelet Pascal and Dino Ienco, 114–23. Lecture Notes in Computer Science. Cham: Springer Nature Switzerland, 2022. https://doi.org/10.1007/978-3-031-18840-4_9.
Dinis, Guilherme, Nordin Zakaria, and Ken Naono. “Pluggable Scheduling on an Open-Source Based Volunteer Computing Infrastructure.” In 2014 International Conference on Computer and Information Sciences (ICCOINS), 1–7, 2014. https://doi.org/10.1109/ICCOINS.2014.6868823.