About me
I am a Ph.D. student specializing in Computer Vision and Deep Learning at the RoPeRT research group, University of Zaragoza, Spain. My academic journey began with a B.Sc. in Mathematics, followed by an M.Sc. in Robotics, Graphics, and Computer Vision, both of which I completed with honors.
Ph.D. Student in Computer Vision and Deep Learning
I am finishing the second year of my Ph.D., which focuses on video understanding tasks such as action recognition, temporal grounding, and video question answering in challenging settings: extremely long videos, event-based data, and egocentric perspectives. I am also interested in applying Bayesian deep learning to computer vision and in transferring state-of-the-art computer vision and deep learning models to robotics.
Video understanding
- Temporal grounding: Can we temporally localize actions in egocentric videos?
- Action recognition: Can we recognize actions in low-light videos?
- Visual Question Answering: Can we answer fine-grained questions about long videos?
- VLMs: How can we adapt vision-language models to handle long videos?
Other topics of interest
- Bayesian Deep Learning: Can BDL techniques improve the performance and calibration of current DL models?
- Robotics: How can we leverage state-of-the-art deep learning models for robotics?
- Event cameras: Are event cameras particularly well-suited for low-light scenarios?
- Generative models: Can we leverage generative models for drone shows?