Publications
2026
- PREPRINT: Disentangling Causal Importance from Emergent Structure in Multi-Expert Orchestration. Sudipto Ghosh, Sujoy Nath, Sunny Manchanda, and Tanmoy Chakraborty. arXiv preprint arXiv:2602.04291, 2026
Multi-expert systems, where multiple Large Language Models (LLMs) collaborate to solve complex tasks, are increasingly adopted for high-performance reasoning and generation. However, the orchestration policies governing expert interaction and sequencing remain largely opaque. We introduce INFORM, an interpretability analysis that treats orchestration as an explicit, analyzable computation, enabling the decoupling of expert interaction structure, execution order, and causal attribution. We use INFORM to evaluate an orchestrator on GSM8K, HumanEval, and MMLU using a homogeneous consortium of ten instruction-tuned experts drawn from LLaMA-3.1 8B, Qwen-3 8B, and DeepSeek-R1 8B, with controlled decoding-temperature variation, and a secondary heterogeneous consortium spanning 1B-7B parameter models. Across tasks, routing dominance is a poor proxy for functional necessity. We reveal a divergence between relational importance, captured by routing mass and interaction topology, and intrinsic importance, measured via gradient-based causal attribution: frequently selected experts often act as interaction hubs with limited causal influence, while sparsely routed experts can be structurally critical. Orchestration behaviors emerge asynchronously, with expert centralization preceding stable routing confidence and expert ordering remaining non-deterministic. Targeted ablations show that masking intrinsically important experts induces disproportionate collapse in interaction structure compared to masking frequent peers, confirming that INFORM exposes causal and structural dependencies beyond accuracy metrics alone.
@article{ghosh2026disentanglingcausalimportanceemergent, title = {Disentangling Causal Importance from Emergent Structure in Multi-Expert Orchestration}, author = {Ghosh, Sudipto and Nath, Sujoy and Manchanda, Sunny and Chakraborty, Tanmoy}, journal = {arXiv preprint arXiv:2602.04291}, year = {2026}, url = {https://arxiv.org/abs/2602.04291} }
2024
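- WORKSHOP: InLegalLLaMA: Indian Legal Knowledge Enhanced Large Language Model. Sudipto Ghosh, Devanshu Verma, Balaji Ganesan, Purnima Bindal, Vikas Kumar, and Vasudha Bhatnagar. In Proceedings of the LKM Workshop at IJCAI, 2024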
Large Language Models (LLMs) are increasingly used in many domains, including law and justice. General-purpose models trained on web data are not performant enough on legal text analytics (LTA) tasks, while fine-tuning task-specific models is expensive because of the annotation and compute costs. Pre-training domain- or application-specific models is increasingly popular. However, pre-training LLMs on small domain corpora like Indian legal documents and judgements is challenging. We introduce our InLegalLLaMA model, along with the related training corpus, adapted for the Indian legal domain, that shows promise of improved performance on LTA tasks.
@inproceedings{ghosh2024inlegalllama, title = {InLegalLLaMA: Indian Legal Knowledge Enhanced Large Language Model}, author = {Ghosh, Sudipto and Verma, Devanshu and Ganesan, Balaji and Bindal, Purnima and Kumar, Vikas and Bhatnagar, Vasudha}, booktitle = {Proceedings of the LKM Workshop at IJCAI}, year = {2024}, issn = {16130073}, volume = {3818}, url = {https://ceur-ws.org/Vol-3818/paper3.pdf}, }
- PREPRINT: Human Centered AI for Indian Legal Text Analytics. Sudipto Ghosh, Devanshu Verma, Balaji Ganesan, Purnima Bindal, Vikas Kumar, and Vasudha Bhatnagar. arXiv preprint arXiv:2403.10944, 2024
Legal research is a crucial task in the practice of law. It requires intense human effort and intellectual prudence to research a legal case and prepare arguments. The recent boom in generative AI has not translated into a proportionate rise in impactful legal applications, because of low trustworthiness and the scarcity of specialized datasets for training Large Language Models (LLMs). This position paper explores the potential of LLMs within Legal Text Analytics (LTA), highlighting specific areas where the integration of human expertise can significantly enhance their performance to match that of experts. We introduce a novel dataset and describe a human-centered, compound AI system that principally incorporates human inputs for performing LTA tasks with LLMs.
@article{ghosh2024human, title = {Human Centered AI for Indian Legal Text Analytics}, author = {Ghosh, Sudipto and Verma, Devanshu and Ganesan, Balaji and Bindal, Purnima and Kumar, Vikas and Bhatnagar, Vasudha}, journal = {arXiv preprint arXiv:2403.10944}, year = {2024}, url = {https://arxiv.org/abs/2403.10944} }
- THESIS: Indian Legal Knowledge Enhanced LLMs for LTA Tasks. Sudipto Ghosh. Department of Computer Science, University of Delhi, 2024
@mastersthesis{ghosh2022thesis, title = {Indian Legal Knowledge Enhanced LLMs for LTA Tasks}, author = {Ghosh, Sudipto}, school = {Department of Computer Science, University of Delhi}, year = {2024}, }
2022
- WORKSHOP: Constructing a Knowledge Graph from Indian Legal Domain Corpus. Sarika Jain, Pooja Harde, Nandana Mihindukulasooriya, Sudipto Ghosh, Abhinav Dubey, and Ankush Bisht. In Proceedings of the TEXT2KG Workshop at ESWC, 2022
While being an important pillar of human society, the legal domain consists of large corpora of complex documents about different aspects such as laws or court judgements. In recent years, knowledge graphs have become a prominent solution to represent such complex information in a semantically rich, machine-readable manner, allowing access to other AI-powered downstream applications. In this work, we aim to construct a reliable knowledge graph from a legal domain corpus that may be utilized by researchers and application developers working in the legal domain. The source dataset chosen is the Indian Legal Court Judgements, and NyOn (Nyaya Ontology) has been utilized for conceptualization. A framework that consists of entity extraction, relation extraction, and triple construction is used to convert the legal text into RDF triples. The knowledge graph thus built has been quantitatively evaluated over a small random sample with reasonable results.
@inproceedings{Jain2022, author = {Jain, Sarika and Harde, Pooja and Mihindukulasooriya, Nandana and Ghosh, Sudipto and Dubey, Abhinav and Bisht, Ankush}, issn = {16130073}, booktitle = {Proceedings of the TEXT2KG Workshop at ESWC}, title = {Constructing a Knowledge Graph from Indian Legal Domain Corpus}, volume = {3184}, year = {2022}, url = {https://ceur-ws.org/Vol-3184/TEXT2KG_Paper_6.pdf}, }