About me
Hi! I am Yiran HU (胡伊然). My name comes from an old Chinese poem: 蒹葭苍苍,白露为霜。所谓伊人,在水一方。(Green, green the reed. Dew and frost gleam. Where’s she I need. Beyond the stream.)
I am now a third-year master's student at Tsinghua University, where I am fortunate to be supervised by Prof. Weixing Shen and Prof. Yiqun Liu. I also serve as a research assistant at the University of Hong Kong under the guidance of Prof. Ben Kao. I am a member of the CS & Law program at Tsinghua University. Taking courses in both the Department of Computer Science and the School of Law, and collaborating on research with students from diverse backgrounds, have made my graduate life truly fascinating.
Over the past two years, my research has focused on Legal LLMs and AI ethics. Looking ahead, I hope to explore further research areas within law and technology. I am particularly interested in integrating domain knowledge into language models and in the robustness and safety of AI.
Education
- 2021-present, Master's student, Computational Law, Tsinghua University.
Tsinghua University Outstanding Graduate (about 1%).
Tsinghua University Scholarship.
- 2017-2021, B.E., Computer Science and Technology, Beijing Foreign Studies University.
Graduated as the top student in my grade.
Beijing Outstanding Graduate.
First-Class Scholarship three times (2018, 2019, 2020).
Double major in Law and Diplomacy.
Publications (Selected)
- LEEC for Judicial Fairness: A Legal Element Extraction Dataset with Extensive Extra-Legal Labels
International Joint Conference on Artificial Intelligence (IJCAI 2024).
[paper] [code]
- MUSER: A Multi-View Similar Case Retrieval Dataset.
CIKM 2023 Best Resource Paper Honorable Mention
32nd ACM International Conference on Information and Knowledge Management. (CIKM 2023).
[paper] [code]
- STARD: A Chinese Statute Retrieval Dataset with Real Queries Issued by Non-professionals. (EMNLP 2024 Findings).
[paper]
- Leveraging Event Schema to Ask Clarifying Questions for Conversational Legal Case Retrieval.
32nd ACM International Conference on Information and Knowledge Management. (CIKM 2023 Full Paper).
[paper]
- Investigating the Conversational Agent Action in Legal Case Retrieval.
The 45th European Conference on Information Retrieval. (ECIR 2023 Full Paper).
[paper] [code]
- LEEC: A Legal Element Extraction Dataset with an Extensive Domain-Specific Label System
32nd ACM International Conference on Information and Knowledge Management Workshop. (MLLD 2023).
[paper]
Projects
- Legal LLM, Group Member
We have built two legal LLMs: a RAG-based legal LLM (Have a try!) and a legal LLM fine-tuned on legal corpora. Based on these models, we have studied several problems that arise when applying legal LLMs, including hallucination, evaluation, and multi-turn dialogue. We are also currently investigating the safety of Legal LLMs. Our goal is to build reliable and interpretable domain-specific LLMs.
- National Ministry of Science and Technology key research and development project “Private Lending Intelligent Trial Technology”, Group Member
This project constructs a fact determination system characterized by case labels, a judgment rule system centered on the focus of disputes, and an automatic recognition algorithm supported by pretrained models. Its results include two core business components, an intelligent trial assistance platform and a similar case retrieval platform, as well as two auxiliary adjudication tools, an adjudication rule base and a complex debt interest calculator. I was responsible for training the similar case retrieval algorithm and for developing the retrieval platform, which is built around case labels and the focus of disputes. The similar case retrieval dataset has been accepted to the CIKM 2023 resource track.
- Conversational Legal Case Retrieval, Group Member
The goal of this project is to construct a conversational legal case retrieval system. The system identifies user needs through multi-round dialogues between users and experts, then performs similar case retrieval based on the user's complete requirements. The project is divided into three parts: user behavior analysis, dataset construction, and pre-trained model construction. The user behavior analysis paper has been accepted at ECIR 2023, and the clarifying question generation paper has been accepted at CIKM 2023.
- The Trusted Legal Artificial Intelligence, Group Member
This project uses counterfactual methods to explore the robustness of judicial pre-trained models in intelligent legal trials. We carried out attacks on models such as Legal-Bert and Lawformer to determine the basis for the models' judgments and to propose methods for improving their robustness. The project is under study and is expected to be completed in spring 2024.
- Work Injury Compensation Tool Series, Group Leader
The work injury compensation tool series includes a work injury compensation calculator (web version and WeiXin mini program version) and a case retrieval platform for work injury compensation. As the person in charge of the series, I led a team that sorted out the work injury insurance regulations and local administrative regulations and organized the legal provisions into calculation logic, which eventually produced both the web and mini program versions of the calculator. The case retrieval platform focuses on typical work injury cases: we sorted out the labels, annotated the cases, and support case search through label-based retrieval. All calculators and retrieval platforms are now online. [Access link]
- The Legal Aid Platform Construction, Group Leader
I carried out the deployment design and wrote the requirements document for the legal aid platform. The platform's functions are divided into four modules: intelligent consultation and service, intelligent distribution, traceability governance, and knowledge community. The initial draft of the requirements document has been completed, and the first phase of the system is expected to be launched in July 2023.
Experiences
- 2023-present, Research Assistant, The University of Hong Kong.
- 2022-2023, Deputy Dean of Dance Troupe, Tsinghua University.
- 2021-2021, Student, Continuing Education College, Beijing Dance Academy.
- 2019-2020, Chair of Student Union, School of Information Science and Technology, Beijing Foreign Studies University.
- 2019-2019, Summer Workshop at the National University of Singapore.
- 2017-2019, Grade Monitor of 2017, Beijing Foreign Studies University.