IMPLEMENTATION OF SEMANTIC SEARCH AND LARGE LANGUAGE MODELS IN THE APPLICATION OF LEGAL SEARCH BASED ON LAWS

IRHAMNA, M. HAIKAL (2025) IMPLEMENTASI PENCARIAN SEMANTIK DAN LARGE LANGUANGE MODELS DALAM PENERAPAN PENCARIAN HUKUM BERDASARKAN UNDANG-UNDANG. Other thesis, Nusa Putra University.

Abstract

Fast and accurate access to legal information is crucial for society, yet many conventional legal search systems have not adopted recent technological advances. One potential solution is semantic search, which can simplify the retrieval of legal information from statutes. This research implements semantic search in a legal information retrieval application delivered as a WhatsApp ChatBot. The system uses the all-mpnet-base-v2 Transformer model, which maps sentences and paragraphs to a 768-dimensional vector space suitable for clustering or semantic search. Two measures, Euclidean Distance and Dot Product, are compared for scoring the similarity between legal documents. The legal texts are converted into 768-dimensional embeddings and stored in the Qdrant vector database. Application development follows the Waterfall model, covering the analysis, design, implementation, and testing phases. Testing uses two approaches: Unit Testing to assess individual system functions, and User Review to evaluate user experience through feedback collected with Google Forms questionnaires. An analysis of the relationship between F1-Score and threshold shows that Euclidean Distance reaches its optimal F1-Score at a threshold of 0.8, whereas Dot Product reaches it at 0.58. The ROC-AUC evaluation gives an AUC of 0.56 for Euclidean Distance (moderate performance) and 0.41 for Dot Product (less optimal performance). On this comparison, Euclidean Distance is the better of the two measures for scoring the similarity between legal documents. Testing of the WhatsApp ChatBot application shows an accuracy of 90%, with an error rate of 10%.
Survey results from 51 respondents indicate a positive reception of the ChatBot: 73% find it easy to use, 69.8% believe it adequately understands legal terms, 71% rate its response speed as fast enough, and 73.8% consider its answers relevant. This research is expected to improve the accuracy and efficiency of legal information retrieval and to offer a technology-based solution for accessing legal information.
Keywords: semantic search, Euclidean Distance, Dot Product, VenomBot, ChatBot, WhatsApp, vector database
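
The abstract compares Euclidean Distance and Dot Product as similarity measures and selects a decision threshold for each by F1-Score. The following is a minimal sketch of that idea, not the thesis implementation: it uses tiny 2-dimensional toy vectors in place of the 768-dimensional embeddings, and the function names, relevance labels, and thresholds are all hypothetical choices made for illustration.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two vectors; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dot_product(a, b):
    """Inner product of two vectors; larger means more similar."""
    return sum(x * y for x, y in zip(a, b))


def f1_at_threshold(scores, labels, threshold, higher_is_similar=True):
    """F1-Score when a document counts as a match on one side of the threshold.

    For a similarity (dot product) a match is score >= threshold;
    for a distance (Euclidean) a match is score <= threshold.
    """
    tp = fp = fn = 0
    for score, relevant in zip(scores, labels):
        predicted = score >= threshold if higher_is_similar else score <= threshold
        if predicted and relevant:
            tp += 1
        elif predicted and not relevant:
            fp += 1
        elif not predicted and relevant:
            fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (assumed, not from the thesis): one query and three documents.
query = [1.0, 0.0]
documents = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]]
labels = [True, True, False]  # hypothetical relevance judgments

distances = [euclidean_distance(query, d) for d in documents]
similarities = [dot_product(query, d) for d in documents]

# Hypothetical thresholds chosen for the toy data only.
f1_euclidean = f1_at_threshold(distances, labels, 1.0, higher_is_similar=False)
f1_dot = f1_at_threshold(similarities, labels, 0.5, higher_is_similar=True)
```

In practice the threshold would be swept over a range of candidate values and the one with the highest F1-Score kept for each measure. Note also that for unit-length vectors the two measures are related by d² = 2 − 2·(a·b), so they rank documents identically; differences between them arise when the stored vectors are not normalized.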

Item Type: Thesis (Other)
Subjects: Computer > Informatic Engineering
Divisions: Faculty of Engineering, Computer and Design > Informatic Engineering
Depositing User: Unnamed user with email liu@nusaputra.ac.id
Date Deposited: 15 Apr 2025 07:46
Last Modified: 15 Apr 2025 07:46
URI: http://repository.nusaputra.ac.id/id/eprint/1439
