Bail Prediction (BAIL)

BAIL requires predicting whether the accused should be granted bail, given a case document (including the facts).

Type of Task: Binary Text Classification
Dataset: HLDC (Kapoor et al., 2022)
Language: Hindi
No. of documents: 900k
Evaluation Metric: macro-F1

Task Motivation and Description

The majority of pending cases in the Indian legal system are in district-level courts, and a significant share of those cases concern bail applications (Kapoor et al., 2022). Many district courts in India use a regional language as their official language. Since Hindi is the most widely spoken language in India and most courts in northern India use Hindi, this task focuses on Bail Prediction for Hindi legal documents.

Formally, given a legal document containing the facts of the case, the task is to predict whether the accused should be granted bail (i.e., a binary decision, 0 or 1).

Dataset

For this task, we use the Hindi Legal Document Corpus (HLDC), created in our previous work (Kapoor et al., 2022). HLDC is a corpus of 900k Hindi legal documents from district courts of a north Indian state. Its creation process includes several pre-processing steps intended to mitigate ethical risks arising from different types of bias. Bail documents are annotated with ground-truth bail decisions.

Dataset Format

Each document (JSON) has the following format:

Dict{
  'id': string  // case identifier
  'district': string  // district of origin
  'text': Dict{
      'facts-and-arguments': List(string) // fact sentences
      'judge-opinion': List(string) // judge opinion sentences
    }
  'label': ClassLabel // GRANTED/DENIED decision
}
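
As a rough illustration of how a record in this format might be consumed, the sketch below parses one JSON document and builds a classifier input from the 'facts-and-arguments' sentences. Several things here are assumptions rather than part of HLDC: the sample record is made up, the label is assumed to arrive as the string GRANTED or DENIED, the 0/1 mapping is chosen for illustration, and leaving out 'judge-opinion' is our reading of the task definition (the input is the facts of the case).

```python
import json

# Hypothetical record following the documented schema (not real HLDC data).
SAMPLE = json.dumps({
    "id": "case-0001",
    "district": "example-district",
    "text": {
        "facts-and-arguments": ["तथ्य वाक्य एक।", "तथ्य वाक्य दो।"],
        "judge-opinion": ["न्यायाधीश की राय।"],
    },
    "label": "GRANTED",
}, ensure_ascii=False)

# Assumed mapping; the task description only says the decision is binary.
LABEL_TO_ID = {"DENIED": 0, "GRANTED": 1}


def to_example(raw):
    """Turn one JSON document into a (text, label_id) pair."""
    doc = json.loads(raw)
    # Use only the fact sentences as model input (assumption, see lead-in).
    facts = doc["text"]["facts-and-arguments"]
    return " ".join(facts), LABEL_TO_ID[doc["label"]]


text, label = to_example(SAMPLE)
```

If the label is stored as an integer class index instead of a string, the mapping step would need to be adapted accordingly.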

Task Evaluation

Since Bail Prediction is a binary classification task, it is evaluated using the standard macro-F1 score.
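
For reference, macro-F1 is the unweighted mean of the per-class F1 scores, so both classes count equally regardless of how often each occurs. The sketch below computes it from scratch on a toy example; the labels and predictions are invented for illustration, and the encoding 0 = denied, 1 = granted is an assumption.

```python
def f1_for_class(y_true, y_pred, cls):
    """F1 of a single class, treating it as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == cls and p == cls for t, p in pairs)
    fp = sum(t != cls and p == cls for t, p in pairs)
    fn = sum(t == cls and p != cls for t, p in pairs)
    denom = 2 * tp + fp + fn
    # Convention used here: F1 is 0 when the class never occurs or is predicted.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred, classes=(0, 1)):
    """Unweighted mean of per-class F1 scores."""
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


# Toy data (made up): 0 = bail denied, 1 = bail granted.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
score = macro_f1(y_true, y_pred)  # class 1 F1 = 0.75, class 0 F1 = 0.5
```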

Baseline Models

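The baseline for Bail Prediction is a multi-task learning model: the main task is binary bail prediction, and the auxiliary task is predicting the salient sentences in the bail document. The underlying encoder is IndicBERT, since the documents are in Hindi. In the Results table below, the Multi-Task model reaches a macro-F1 of 0.77 on the district-wise split and 0.78 on the all-districts split; the highest all-districts score, 0.81, belongs to the TF-IDF + IndicBERT and TextRank + IndicBERT variants.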
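
To make the main/auxiliary structure concrete, here is a minimal sketch of one way the two objectives could be combined into a single training loss. It is not the implementation from Kapoor et al. (2022): the weighted-sum form, the `aux_weight` parameter, the use of binary cross-entropy for both tasks, and the toy probabilities are all assumptions made for illustration.

```python
import math


def bce(prob, target, eps=1e-12):
    """Binary cross-entropy for one predicted probability and a 0/1 target."""
    prob = min(max(prob, eps), 1 - eps)  # clamp to avoid log(0)
    return -(target * math.log(prob) + (1 - target) * math.log(1 - prob))


def multitask_loss(bail_prob, bail_label, sal_probs, sal_labels, aux_weight=0.5):
    """Main bail loss plus a weighted auxiliary salience loss (assumed form)."""
    main = bce(bail_prob, bail_label)
    # Auxiliary task: one salient / not-salient decision per sentence, averaged.
    aux = sum(bce(p, t) for p, t in zip(sal_probs, sal_labels)) / len(sal_probs)
    return main + aux_weight * aux


# Toy values: the document-level bail probability and three sentence-level
# salience probabilities, as a shared encoder with two heads might produce.
loss = multitask_loss(0.8, 1, [0.9, 0.2, 0.6], [1, 0, 1])
```

In a sketch like this, the auxiliary term only shapes the shared representation during training; at test time only the bail head's output would be used.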

Two separate data splits are considered:

(i) all-districts: train, dev and test contain files from all districts

(ii) district-wise: dev and test contain files from districts not present in train
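
The district-wise setting can be sketched as a split over districts rather than over documents, so that no dev or test district appears in train. The function name, the 10% fractions, the random seed, and the choice to keep dev and test districts disjoint from each other are all assumptions for illustration; the actual HLDC splits are described in the paper.

```python
import random


def district_wise_split(docs, dev_frac=0.1, test_frac=0.1, seed=0):
    """Assign whole districts to dev/test so that train never sees them."""
    districts = sorted({d["district"] for d in docs})
    random.Random(seed).shuffle(districts)
    n_dev = max(1, round(len(districts) * dev_frac))
    n_test = max(1, round(len(districts) * test_frac))
    dev_districts = set(districts[:n_dev])
    test_districts = set(districts[n_dev:n_dev + n_test])

    split = {"train": [], "dev": [], "test": []}
    for doc in docs:
        if doc["district"] in dev_districts:
            split["dev"].append(doc)
        elif doc["district"] in test_districts:
            split["test"].append(doc)
        else:
            split["train"].append(doc)
    return split


# Toy corpus (made up): 50 documents spread evenly over 10 districts.
docs = [{"id": str(i), "district": f"d{i % 10}"} for i in range(50)]
splits = district_wise_split(docs)
```

The all-districts setting, by contrast, would shuffle and split at the document level, so every district can contribute to train, dev and test.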

For more details, please refer to our paper (Kapoor et al., 2022).

Results

Model                        District-wise macro-F1   All Districts macro-F1
Doc2Vec + SVM                0.69                     0.77
Doc2Vec + XGBoost            0.59                     0.57
IndicBERT - first 512        0.62                     0.71
IndicBERT - last 512         0.60                     0.76
TF-IDF + IndicBERT           0.74                     0.81
TextRank + IndicBERT         0.74                     0.81
Salience Pred. + IndicBERT   0.74                     0.78
Multi-Task                   0.77                     0.78