Backprompting: Leveraging Synthetic Production Data for Health Advice Guardrails
Abstract: The pervasiveness of large language models (LLMs) in enterprise settings has also brought forth a significant number of risks associated with their usage. Guardrails technologies aim to mitigate these risks by filtering LLMs' input/output text through various detectors. However, developing and maintaining robust detectors poses many challenges, one of which is the difficulty of acquiring production-quality labeled data on real LLM outputs prior to deployment. ...
Read more at arxiv.org