Spatio-temporal Sign Language Representation and Translation
2510.19413v1
cs.CL, cs.CV
2025-10-24
Authors:
Yasser Hamidullah, Josef van Genabith, Cristina España-Bonet
Abstract
This paper describes the DFKI-MLT submission to the WMT-SLT 2022 sign
language translation (SLT) task from Swiss German Sign Language (video) into
German (text). State-of-the-art techniques for SLT use a generic seq2seq
architecture with customized input embeddings. Instead of word embeddings as
used in textual machine translation, SLT systems use features extracted from
video frames. Standard approaches often do not benefit from temporal features.
For our submission, we present a system that learns spatio-temporal feature
representations and translation in a single model, resulting in a real
end-to-end architecture expected to better generalize to new data sets. Our
best system achieved $5\pm1$ BLEU points on the development set, but
performance on the test set dropped to $0.11\pm0.06$ BLEU points.