About me
I am a Ph.D. student specializing in Computer Vision and Deep Learning at the RoPeRT research group, University of Zaragoza, Spain. During my studies, I have been a visiting researcher at both Google DeepMind (Mountain View, USA) and UCSD (San Diego, USA). My academic journey began with a B.Sc. in Mathematics, followed by an M.Sc. in Robotics, Graphics, and Computer Vision, both of which I completed with honors.
Ph.D. Student in Computer Vision and Deep Learning
I am a Ph.D. candidate in Computer Vision and Deep Learning. My current research is centered on building multimodal agents for a variety of computer vision tasks. This work spans image and video understanding, including action recognition and video question answering, as well as model self-improvement. Previously, I explored the application of Bayesian deep learning and the transfer of state-of-the-art models to robotics.
Main topics of interest
-
Multimodal Agents
Can agents improve the performance of current LLMs on challenging vision tasks?
-
Video Understanding
Video Question Answering, Temporal Grounding, and Action Recognition.
Other topics of interest
-
Bayesian Deep Learning
Can Bayesian deep learning (BDL) techniques improve the performance and calibration of current deep learning models?
-
Robotics
How can we leverage state-of-the-art deep learning models for robotics?
-
Event Cameras
Are event cameras particularly well-suited for low-light scenarios?
-
Generative Models
Can we apply generative models to drone shows?