AI-ML·중요도 6·2026. 05. 23.·r/MachineLearning

I fine-tuned an LLM to be C-3PO to test which training data format works best for persona injection [P]

── KO ──────────────────

C-3PO처럼 행동하도록 LLM을 미세 조정한 실험 결과를 다룹니다.

이 글에서는 LLM을 C-3PO처럼 동작하도록 미세 조정하는 실험을 통해 각각의 데이터 형식이 인물 주입에 미치는 영향을 테스트했습니다. 세 가지 형식, 즉 채팅 데모, 첫 번째 인칭 진술, 그리고 합성 위키 스타일 문서가 비교되었습니다. 예상과는 달리 첫 번째 인칭 진술이 일반화에 가장 좋은 것으로 나타났습니다. 또한, 합성 문서 형식은 C-3PO가 불안해하는 특성을 아는 것과 느끼는 것이 다름을 보여주었습니다. 코드와 GitHub 링크도 포함되어 있습니다.

── EN ──────────────────

The article discusses fine-tuning an LLM to behave like C-3PO and testing different training data formats.

This article details an experiment fine-tuning an LLM to operate like C-3PO, testing how different data formats impact persona injection. Three formats were evaluated: chat demos, first-person statements, and synthetic Wikipedia-style documents. Surprisingly, first-person statements achieved the best generalization. Additionally, the synthetic document model revealed that knowing a trait and feeling it are distinct in weight space. A link to the code and GitHub repository is provided.

원문 보기 →목록으로