APPLICATION OF THE K-NEAREST NEIGHBOR (KNN) METHOD TO SENTIMENT ANALYSIS OF SHOPEE APP USER REVIEWS ON THE GOOGLE PLAY STORE

Authors

  • Fauzia Win Salsabila, Universitas Panca Sakti Bekasi
  • Tumini, Universitas Panca Sakti Bekasi
  • Agmawarnida, Universitas Panca Sakti Bekasi

Keywords:

Sentiment Analysis, Shopee, K-Nearest Neighbor, TF-IDF, Google Play

Abstract

Technological advances have driven growth in the number of e-commerce users, including users of Shopee, currently one of the largest online shopping platforms in Indonesia. The many user reviews of the Shopee application on the Google Play Store form a large body of opinion. These opinions are useful both to the company, for improving service quality, and to the public, for making purchasing decisions. However, the volume of reviews makes manual analysis inefficient and time-consuming. This study therefore applies an automated approach to determine whether Shopee user comments tend to be positive, negative, or neutral. The method used is the K-Nearest Neighbor (KNN) algorithm, applied to a dataset of 100,000 reviews. The research proceeded in several stages. First, Shopee user review data were obtained from Kaggle. Second, the text was preprocessed through data cleaning, normalization, tokenization, stopword removal, and stemming. Third, terms were weighted using Term Frequency-Inverse Document Frequency (TF-IDF). Finally, classification was performed with KNN using several values of k. The results show that most reviews were positive (77.2%), while 18.5% were negative and 4.3% were neutral. The KNN model consistently achieved an accuracy of 77% across the different training and testing data splits. Performance was best on positive reviews, with a precision of 0.79, a recall of 0.95 to 0.96, and an F1-score of 0.87. Performance on negative and neutral reviews remained low because of class imbalance in the data. The study concludes that KNN recognizes positive reviews fairly well, but further improvement is needed before it can reliably distinguish negative and neutral reviews. These findings can help application developers refine automated analysis systems and can inform e-commerce businesses seeking to improve their services.
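The weighting and classification stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the six English toy reviews, the smoothed IDF formula, the use of cosine similarity as the neighbor measure, and all function names (tokenize, fit_idf, tfidf_vector, knn_predict, classify) are assumptions made for illustration, and the full preprocessing chain (normalization, stopword removal, stemming) is reduced here to lowercasing and simple tokenization.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep alphabetic tokens (a stand-in for full preprocessing)."""
    return re.findall(r"[a-z]+", text.lower())


def fit_idf(docs_tokens):
    """Smoothed inverse document frequency for every term in the training set."""
    n_docs = len(docs_tokens)
    df = Counter(term for tokens in docs_tokens for term in set(tokens))
    return {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf_vector(tokens, idf):
    """L2-normalised sparse TF-IDF vector; terms unseen in training are ignored."""
    tf = Counter(t for t in tokens if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def cosine(a, b):
    """Cosine similarity of two L2-normalised sparse vectors (a dot product)."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def knn_predict(query_vec, train_vecs, train_labels, k):
    """Majority vote among the k training vectors most similar to the query."""
    ranked = sorted(
        range(len(train_vecs)),
        key=lambda i: cosine(query_vec, train_vecs[i]),
        reverse=True,
    )
    votes = Counter(train_labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy reviews; the study itself used 100,000 Shopee reviews.
TRAIN = [
    ("great app fast delivery", "positive"),
    ("love the discounts great service", "positive"),
    ("easy to use great prices", "positive"),
    ("app keeps crashing terrible", "negative"),
    ("terrible service late delivery refund slow", "negative"),
    ("okay app nothing special", "neutral"),
]

train_tokens = [tokenize(text) for text, _ in TRAIN]
train_labels = [label for _, label in TRAIN]
idf = fit_idf(train_tokens)
train_vecs = [tfidf_vector(tokens, idf) for tokens in train_tokens]


def classify(text, k=3):
    """Vectorise a new review with the fitted IDF table and classify it with KNN."""
    return knn_predict(tfidf_vector(tokenize(text), idf), train_vecs, train_labels, k)


if __name__ == "__main__":
    # Try several values of k, mirroring the k variations mentioned in the abstract.
    for k in (1, 3, 5):
        print(k, classify("great service and fast delivery", k=k))
```

Because the vote is a simple majority, a class that dominates the training data also tends to dominate the neighborhood of a query as k grows, which is one way the class imbalance reported in the abstract can depress recall on the negative and neutral classes.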

Published

2025-10-30