iSign: A Benchmark for Indian Sign Language Processing

¹IIT Kanpur, ²Max Planck Institute for Psycholinguistics, ³ISLRTC, ⁴Microsoft
iSign Thumbnail

iSign: the proposed benchmark for Indian Sign Language Processing.

ISL example

An example showing the translation of the phrase "What, Where, How, and When" in Indian Sign Language. The length of each text box overlaps with the corresponding sign, with a pause position between consecutive signs.

Tasks in iSign

The table below lists the tasks currently available in the iSign benchmark.

Task   | Task Name                                          | Links to dataset files
Task-1 | ISL-to-English Translation                         | a) ISLVideo-to-English Translation; b) ISLPose-to-English Translation
Task-2 | English-to-ISLPose Generation                      | English Translation-to-ISLPose
Task-3 | Word/Gloss Recognition (Isolated Sign Recognition) | CISLR
Task-4 | Word Presence Prediction                           | Word-ExampleSentence-pairs
Task-5 | Semantic Similarity Prediction                     | Word-Description-pairs

Dataset Directory Format

The iSign dataset can be downloaded using the following Download Link. The dataset directory structure is as follows:

iSign-Benchmark
    ├── Data
        ├── ISL-videos.tar.gz                # ISL sentence/phrase videos
    ├── Extracted-Features
        ├── mediapipe_holistic_poses1.tar.gz
        ├── mediapipe_holistic_poses2.tar.gz
        .
        .
        ├── mediapipe_holistic_poses12.tar.gz
        ├── mediapipe_holistic_poses13.tar.gz
    ├── Sample-Data
        ├── dataset_sample
            ├── def-words                    # DEF word videos sample
            ├── generation                   # ISL generation videos
            ├── ground_truth_keypoints       # ISL keypoints
            ├── task-4-word-presence         # Task-4 (_w.mp4, _e*.mp4)
            ├── task-5-word-semantics        # Task-5 (_w.mp4, _d.mp4)
            ├── translation                  # Task-1, Task-2 (sample videos)
        ├── code.zip                         # Baseline Code 
        iSign_data.csv                       # Task-1, Task-2 ISL video uid with respective ENGLISH translations
        word-presence-dataset.csv            # Task-4 word-presence query, candidate pairs (uids)
        word-description-dataset.csv         # Task-5 word-semantics query, candidate pairs (uids)
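
As a rough illustration of working with the layout above, the following sketch enumerates the pose archives and extracts whichever ones are present locally. It assumes the archives are numbered consecutively from 1 to 13, as the listing suggests; the function names and the local directory arguments are hypothetical and not part of the released code.

```python
import tarfile
from pathlib import Path


def pose_archive_names(first: int = 1, last: int = 13) -> list:
    """Archive names, assuming consecutive numbering (an assumption from the listing)."""
    return [f"mediapipe_holistic_poses{i}.tar.gz" for i in range(first, last + 1)]


def extract_pose_archives(feature_dir: Path, out_dir: Path) -> list:
    """Extract every pose archive found in feature_dir into out_dir.

    Returns the names of archives that were expected but not found,
    so that a partial download is easy to spot.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    missing = []
    for name in pose_archive_names():
        archive = feature_dir / name
        if not archive.exists():
            missing.append(name)
            continue
        # Only extract archives obtained from a source you trust.
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(out_dir)
    return missing
```

Calling extract_pose_archives(Path("Extracted-Features"), Path("poses")) after a partial download returns the list of archives still to be fetched.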

iSign Task Samples

Task-1: ISL-to-English Translation

ISL-to-English is a standard machine translation task, with an ISL video as input and the corresponding English translation as the prediction. Dataset links: a) ISLVideo-to-English Translation b) ISLPose-to-English Translation

English Translation (reference): "However in India there is a huge demand for cheap phones."
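
A minimal sketch of how the video uids could be paired with their reference translations is shown here. The column names uid and text, as well as the uid value in the sample row, are assumptions made for illustration; the actual header of iSign_data.csv should be checked before use.

```python
import csv
import io


def load_translations(csv_file, uid_col="uid", text_col="text"):
    """Map each video uid to its English reference translation.

    uid_col and text_col are hypothetical column names.
    """
    reader = csv.DictReader(csv_file)
    return {row[uid_col]: row[text_col].strip() for row in reader}


# In-memory stand-in for the CSV file; "sample-0001" is a made-up uid.
sample = io.StringIO(
    "uid,text\n"
    'sample-0001,"However in India there is a huge demand for cheap phones."\n'
)
references = load_translations(sample)
```

With a real file, the same function can be called on an open file handle, for example open(path, newline="", encoding="utf-8").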


Task-2: English-to-ISLPose Generation

Dataset links: English Translation-to-ISLPose
The sample for English-to-ISLPose Generation can be found below.

English Input: "has not recorded a single COVID infection till now."


Task-3: Word/Gloss Recognition (Isolated Sign Recognition)

We use the dataset provided by CISLR [EMNLP 2022]. The dataset can be accessed using this link.


Task-4: Word Presence Prediction

For this task, we use query-candidate pairs of ISL videos. Below is an example of an ISL word video (query) and an ISL sentence video (candidate). The complete dataset of word-sentence pairs can be found here.


Word (query): "premises"


Example Sentence (candidate): "The company is moving to new premises next month."
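
The sample directory marks Task-4 files with the suffixes _w.mp4 (word) and _e*.mp4 (example sentence). The sketch below groups such files into one query and its candidates. It assumes that files belonging together share the same name stem, which is an assumption about the naming scheme rather than documented behaviour; the file names in the usage line are made up.

```python
import re
from collections import defaultdict

# "<stem>_w.mp4" is treated as the query, "<stem>_e<anything>.mp4" as a candidate.
_TASK4_PATTERN = re.compile(r"^(?P<stem>.+)_(?P<kind>w|e[^_]*)\.mp4$")


def group_word_presence(filenames):
    """Group Task-4 sample files into {stem: {"query": ..., "candidates": [...]}}."""
    groups = defaultdict(lambda: {"query": None, "candidates": []})
    for name in sorted(filenames):
        match = _TASK4_PATTERN.match(name)
        if match is None:
            continue  # skip files that do not follow the assumed naming scheme
        entry = groups[match.group("stem")]
        if match.group("kind") == "w":
            entry["query"] = name
        else:
            entry["candidates"].append(name)
    return dict(groups)


pairs = group_word_presence(["premises_w.mp4", "premises_e1.mp4", "premises_e2.mp4"])
```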


Task-5: Semantic Similarity Prediction

For this task, we use query-candidate pairs of ISL videos. Below is an example of an ISL word video (query) and an ISL video explaining the meaning of that word (candidate). The complete dataset of query-candidate pairs can be found here.


Word (query): "immediate"


Description (candidate): "happening or done without delay or very soon after something else"
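
One common way to frame a query-candidate task like this is to embed both videos as vectors and rank candidates by cosine similarity. The sketch below shows only that ranking step. It is an illustration under the assumption that some encoder has already produced fixed-length embeddings; it is not a description of the benchmark's baseline, and the uids and vectors are invented.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(query_vec, candidates):
    """Return (uid, score) pairs sorted from most to least similar to the query."""
    scored = [(uid, cosine_similarity(query_vec, vec)) for uid, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy embeddings standing in for encoded word and description videos.
ranked = rank_candidates([1.0, 0.0], {"d1": [1.0, 0.0], "d2": [0.0, 1.0], "d3": [1.0, 1.0]})
```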

Dataset Validation

To verify the reliability of the ISL-English video-sentence/phrase pairs in the dataset, we took the help of certified ISL signers. Because few certified ISL signers were available, we could only use a small, randomly selected sample of sign-text pairs (593 pairs) for human translation and validation.

iSign sample human validation

The table shows a sample of English translations from the created dataset compared with sentences translated by an ISL signer. Blue and red text highlights the differences between the semi-automatically generated English sentences and the gold sentences produced by the ISL instructor.

iSign sample human validation scores

The table shows the translation scores for a sample of 593 sentence pairs from the created dataset, compared against references translated by an ISL signer.
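
To make the idea of scoring a dataset sentence against a signer's reference concrete, here is a small sketch of word error rate (WER), computed as word-level edit distance divided by reference length. This page does not name the metrics behind the reported scores, so WER is used here purely as an example of such a comparison, and the function name is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance between hypothesis and reference, divided by reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        return 0.0 if not hyp else 1.0
    # prev[j] holds the edit distance between the reference words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

Lower values mean the dataset sentence is closer to the human reference; identical sentences give 0.0.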

Acknowledgments

We would like to thank the Indian Sign Language Research and Training Center (ISLRTC) team for helping us validate the quality of the curated translation dataset. We would also like to extend our immense gratitude to the ISLRTC members for their full support in sharing the ISL content creation process. Finally, we would like to express our gratitude to the ISL content creators on YouTube, ISH-News and DEAF ENABLED FOUNDATION (DEF); without their data, this work would have been impossible.

BibTeX

@inproceedings{iSign-2024,
  title = "{iSign}: A Benchmark for Indian Sign Language Processing",
  author = "Joshi, Abhinav  and
    Mohanty, Romit  and
    Kanakanti, Mounika  and
    Mangla, Andesha  and
    Choudhary, Sudeep  and
    Barbate, Monali  and
    Modi, Ashutosh",
  booktitle = "Findings of the Association for Computational Linguistics: ACL 2024",
  month = aug,    
  year = "2024",
  address = "Bangkok, Thailand",
  publisher = "Association for Computational Linguistics", 
}